Cs3492 Dbms MLM
3003
Purpose of Database System – Views of data – Data Models – Database System Architecture – Introduction to
relational databases – Relational Model – Keys – Relational Algebra – SQL fundamentals – Advanced SQL features –
Embedded SQL– Dynamic SQL
Q.1 What is Database Management System? Why do we need a DBMS? AU : May 05,Dec - 08
A Database Management System (DBMS) is a collection of interrelated data and a set of
programs that are used to access and handle that data.
The primary goal of DBMS is to provide a way to store and retrieve the required
information from the database in convenient and efficient manner.
Q.2 What is the purpose of database management system ? AU : Dec 14
The purpose of database management system is –
1. Define the structure for storage of information.
2. Provide mechanism for manipulation of information.
3. In addition, the database systems must ensure the safety of information stored.
Q.3 List any two advantages of database systems AU : Dec 07
Following are the advantages of DBMS -
1) A DBMS reduces data redundancy, that is, it limits duplication of data in the database.
2) A DBMS allows the desired data to be retrieved in the required format.
3) Data can be isolated in separate tables for convenient and efficient use.
4) Data can be accessed efficiently using a simple query language.
Q.4 Define data abstraction AU : May 05
Data abstraction means presenting users with only the required information about the system while hiding background
details such as how the data is stored and maintained.
Q.5 What are three levels of data abstraction ? AU : Dec 02, 04,May 14, Dec 17
The three levels of data abstraction are –
1. Physical Level
2. Logical Level
3. View Level
Q.6 Is it possible for several attributes to have same domain ? Illustrate your answer with suitable example AU :
Dec 04, Dec 15
A domain is the set of legal values that can be assigned to an attribute. Each attribute in a database must have a
well-defined domain; we can’t mix values from different domains in the same attribute. However, the same domain can be
shared by several attributes, so it is possible for several attributes to have the same domain.
For example - Consider the relations Student (RollNo, Name, Address) and Employee (EmpID, Ename, Salary, Address).
The attributes Name and Ename both take their values from the domain of person names, and the two Address attributes
share the domain of address strings.
Q.7 Discuss briefly three major disadvantages of keeping organizational information in a file processing system AU:
Dec 04, May 16
1) Data redundancy
2) Data inconsistency
3) Difficulty in accessing data
4) Data isolation
5) Integrity problems
6) Atomicity problems
7) Concurrent access anomalies
8) Security problems
Q.8 What is data model ? AU : Dec 11
It is a collection of conceptual tools for describing data, relationships among data, semantics (meaning) of data
and constraints.
Q.9 What are different types of data models ? AU : May 12
Various types of data models are –
(1) Relational Data Model (2) Entity-Relationship Data Model
(3) Object Based Data Model (4) Semi-structured Data Model
Q.10 Write the characteristic that distinguish the database approach with File based approach AU : May 15, Dec 16
OR
What are main differences between file processing system and a DBMS ? AU : May 06, Dec 06
Q.11 Name the categories of SQL commands AU : May 12
The categories of SQL commands are –
(1) Data Definition Language (DDL)
(2) Data Manipulation Language (DML)
(3) Data Control Language (DCL)
Q.12 What is data definition language ? Give example AU : Dec 16, May 18
Data Definition Language (DDL) is a specialized language used to specify a database schema by a set of
definitions.
It is a language which is used for creating and modifying the structures of tables, views, indexes and so on.
Some of the common commands used in DDL are -CREATE, ALTER, DROP.
Q.13 Give brief description of DCL command AU : Dec 14
DCL stands for Data Control Language. It includes commands such as GRANT and REVOKE, which mainly deal
with the rights, permissions and other controls of the database system.
Q.14 Define the term tuple AU : Dec 05
Tuple means a row present in the table
Q.15 Why does SQL allow duplicate tuples in a table or in a query result? AU : Dec 15
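Real data can legitimately repeat; for example, two people may have the same name. SQL therefore treats a table or a
query result as a multiset (bag) of tuples rather than a set. Eliminating duplicates automatically would be expensive,
and users may want to see duplicates, for example when computing counts or sums.
When uniqueness is required, we can apply primary key constraints or unique constraints on a table, or use the DISTINCT
keyword in a query.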
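A minimal sketch of duplicates in a query result, using Python's built-in sqlite3 module. The Person table and its rows are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Name TEXT)")
# The primary key Id keeps every row unique, but the Name column may repeat.
conn.executemany("INSERT INTO Person VALUES (?, ?)", [(1, "Ravi"), (2, "Ravi"), (3, "Meena")])

# A plain projection keeps the duplicate name.
all_names = conn.execute("SELECT Name FROM Person ORDER BY Name").fetchall()

# DISTINCT removes duplicate tuples from the result.
distinct_names = conn.execute("SELECT DISTINCT Name FROM Person ORDER BY Name").fetchall()
```

Here all_names holds three tuples while distinct_names holds two.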
Q.16 Why key is essential? Write the different types of keys AU : Dec 04
Keys are used to specify the tuples distinctly in the given relation.
Various types of keys used in relational model are – Superkey, Candidate Keys, primary keys, foreign keys.
Q.17 Define primary key. Give example. AU : May 09
The primary key is a candidate key chosen by the database designer to identify the tuple in the relation uniquely.
For example – Consider a Student relation as Student (RollNo, Name, Address). The primary key for this
relation is RollNo, because no two students share the same roll number.
Q.18 Define foreign key. Give example AU : May 18
Foreign key is a single attribute or collection of attributes in one table that refers to the primary key of other
table.
For example - Consider a Student database as Student (RollNo, Name, Address) and Course (CourseId,
CourseName, RollNo). Here RollNo in Course is a foreign key that refers to the primary key RollNo of Student.
Q.19 What is the difference between primary key and foreign key ? AU : Dec 05
1. Primary key: a column or a set of columns that can be used to uniquely identify a row in a table.
   Foreign key: a column or a set of columns that refers to a primary key or a candidate key of another table.
2. Primary key: a table can have only a single primary key.
   Foreign key: a table can have multiple foreign keys, which can reference different tables.
Q.20 What is referential integrity ? AU : May 04,08
The referential integrity rule states that “whenever a foreign key value is used it must reference a valid,
existing primary key in the parent table”.
Example : Consider the situation where you have two tables : Employees and Managers. The Employees table
has a foreign key attribute named ManagedBy, which points to the record for each employee’s manager in the Managers
table. Every ManagedBy value must match an existing record in the Managers table.
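A minimal sketch of the referential integrity rule using Python's built-in sqlite3 module. The table names and the ManagedBy column follow the example above; the other column names, the sample rows and the helper try_insert_employee are assumptions made for illustration. SQLite checks foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Managers (ManagerId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE Employees ("
    " EmpId INTEGER PRIMARY KEY,"
    " Name TEXT,"
    " ManagedBy INTEGER REFERENCES Managers(ManagerId))"
)
conn.execute("INSERT INTO Managers VALUES (1, 'Asha')")


def try_insert_employee(emp_id, name, managed_by):
    """Return True if the row is accepted, False if referential integrity rejects it."""
    try:
        conn.execute("INSERT INTO Employees VALUES (?, ?, ?)", (emp_id, name, managed_by))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_employee(10, "Ravi", 1)    # manager 1 exists in the parent table
rejected = try_insert_employee(11, "Meena", 99)  # manager 99 does not exist
```

The first insert is accepted and the second is refused, because 99 does not reference a valid row in the parent table.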
Q.21 What is domain integrity? Give example AU : Dec 08
Domain integrity ensures that all the data items in a column fall within a defined set of valid values. Each column
in a table has a defined set of values, such as the set of all numbers for zip (five-digit), the set of all character strings for
name.
Q.22 What are different types of integrity constraints used in designing relational databases AU : Dec 07
Different types of integrity constraints are –
(1) Entity Integrity Constraint
(2) Referential Integrity Constraint
(3) Domain Integrity Constraint
(4) Key Integrity Constraint
Q.23 List the reasons why null value might be introduced into the database AU : May 06
NULL is a special value provided by the database for two cases –
i) When the value of a field is unknown for some tuples (for e.g. the city name has not been assigned yet) and
ii) When the value is inapplicable (for e.g. a person has no middle name).
Q.24 List various operators used in relational algebra AU : May 06
Various operators used in Relational algebra are –
(1) Selection Operator(σ)
(2) Projection Operator(Π)
(3) Cartesian Product (×)
(4) Rename Operator (ρ)
Q.25 Describe briefly any two undesirable properties that a database design may have ? AU : Dec 02
The two undesirable properties that a database design may have –
(1) Repetition of data
(2) Inability to represent certain information in the database.
Q.26 Specify the different types of keys used in database management system. AU : Dec 02
Various types of keys used in relational model are – Superkey, Candidate Keys, primary keys, foreign keys.
Q.27 Define data independence. AU : May 08
Data independence is the ability to change the schema at one level without affecting the schema at the next higher
level. Here level can be physical, conceptual or external.
Q.28 Distinguish between Physical and logical data independence AU : May 03
1. Physical Independence : This is a kind of data independence which allows the modification of the physical schema
without requiring any change to the conceptual schema. For example - if the storage device or the file organization used
by the database server is changed, the logical structure of the data objects is not affected.
2. Logical Independence : This is a kind of data independence which allows the modification of the conceptual
schema without requiring any change to the external schema. For example - adding a new column to a table does not
affect the existing user views.
Q.29 What is meant by instance and Schema of the database AU : May 04, Dec 05
When information is inserted into or deleted from the database, the database changes. The collection of
information stored in the database at a particular moment is called an instance of the database.
The overall design of the database is called the schema.
Q.30 Differentiate between Dynamic SQL and Static SQL AU: Dec 14,May 15, Dec 15, Dec 16, Dec 17
1. Static SQL: SQL statements are compiled at compile time. Dynamic SQL: SQL statements are compiled at run time.
2. Static SQL: it is more efficient. Dynamic SQL: it is less efficient.
3. Static SQL: it is less flexible. Dynamic SQL: it is more flexible.
4. Static SQL: it is used in situations where data is distributed uniformly. Dynamic SQL: it is used in situations
where data is distributed non uniformly.
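The contrast can be sketched in Python with the built-in sqlite3 module. This is only an analogy: Python does not precompile embedded SQL the way a host-language precompiler does, so the sketch shows just the difference between statement text fixed in the program and statement text assembled at run time. The Student rows, ALLOWED_COLUMNS and find_students are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Anu", "Chennai"), (2, "Bala", "Madurai"), (3, "Chitra", "Chennai")],
)

# Static style: the full statement text is fixed when the program is written;
# only the parameter value changes between calls.
STATIC_QUERY = "SELECT Name FROM Student WHERE Address = ? ORDER BY RollNo"
static_rows = conn.execute(STATIC_QUERY, ("Chennai",)).fetchall()

# Dynamic style: the statement text is assembled at run time from the caller's choices.
ALLOWED_COLUMNS = {"RollNo", "Name", "Address"}


def find_students(filters):
    """Build a SELECT from a {column: value} dict that is only known at run time."""
    for column in filters:
        if column not in ALLOWED_COLUMNS:  # never splice unchecked names into SQL text
            raise ValueError(f"unknown column: {column}")
    where = " AND ".join(f"{column} = ?" for column in filters)
    sql = "SELECT Name FROM Student"
    if where:
        sql += " WHERE " + where
    sql += " ORDER BY RollNo"
    return conn.execute(sql, tuple(filters.values())).fetchall()


dynamic_rows = find_students({"Address": "Chennai", "Name": "Chitra"})
```

The dynamic version is more flexible because any combination of columns can be filtered, at the cost of building and preparing a new statement for each call.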
PART B
7. Consider the universal relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies
F = {{A, B} -> {C}, {A} -> {D, E}, {B} -> {F}, {F} -> {G, H}, {D} -> {I, J}}. What is the key for R? Decompose R into
2NF and then 3NF relations.
8. What are the pitfalls in relational database design? With a suitable example, explain the role of functional dependency
in the process of normalization.
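For question 7 above, a candidate key can be checked by computing an attribute closure. The sketch below is a minimal illustration, assuming that R also contains J (J appears on the right-hand side of the last dependency); the function name closure and the string encoding of attribute sets are choices made for this sketch.

```python
def closure(attributes, fds):
    """Return the closure of a set of attributes under a list of (lhs, rhs) dependencies."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already covered and rhs adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCDEFGHIJ")  # assumption: J belongs to R
FDS = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]

ab_closure = closure("AB", FDS)  # covers every attribute of R
a_closure = closure("A", FDS)    # A alone reaches only A, D, E, I, J
b_closure = closure("B", FDS)    # B alone reaches only B, F, G, H
```

Since the closure of {A, B} is all of R while neither A nor B alone determines R, {A, B} is a key. The dependencies A -> {D, E} and B -> {F} are then partial dependencies on the key, which is what the 2NF decomposition has to remove.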
PART B
1. Consider the following tables: Employee (Emp_no, Name, Emp_city)
Company (Emp_no, Company_name, Salary)
i. Write a SQL query to display Employee name and company name.
ii. Write a SQL query to display employee name, employee city ,company name and salary of all the employees whose
salary >10000
iii. Write a query to display all the employees working in “XYZ” company.
2. Explain Embedded and Dynamic SQL.
3. Explain briefly about the steps required in query processing.
4. Explain the three kinds of database tuning.
5. Write the following
Nested loop join
Block Nested loop join
Merge join
Hash Join
UNIT III
TRANSACTIONS
PART A
Q.1 What is a transaction ? AU : May-04, Dec.05
A transaction can be defined as a group of tasks that form a single logical unit.
Q.2 What does time to commit mean ? AU : May-04
The COMMIT command is used to permanently save the changes made by a transaction to the database.
Until a transaction commits, the changes made by its read and write operations can be undone by a rollback
operation. The time to commit is the point at which the transaction issues COMMIT to make these changes permanent.
Q.3 What are ACID properties ? AU : May-05,06,08,13,15,Dec.-07,14,17
In a database, each transaction should maintain ACID property to meet the consistency and integrity of the
database. These are
(1) Atomicity (2) Consistency (3) Isolation (4) Durability
Q.4 Give the meaning of the expression ACID transaction. AU : Dec.-08
The expression ACID transaction represents the transaction that follows the ACID Properties.
Q.5 State the atomicity property of a transaction. AU : May-09,13
This property states that each transaction must be considered as a single unit and must be completed fully or not
completed at all. No transaction in the database is left half completed.
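A minimal sketch of atomicity using Python's built-in sqlite3 module. The Account table, its CHECK constraint and the helper functions transfer and balances are assumptions made for illustration: a transfer consists of two updates, and if the second one fails the first one is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Id INTEGER PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates are committed or neither is."""
    try:
        conn.execute("UPDATE Account SET Balance = Balance + ? WHERE Id = ?", (amount, dst))
        conn.execute("UPDATE Account SET Balance = Balance - ? WHERE Id = ?", (amount, src))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the credit that had already succeeded
        return False


def balances():
    """Return the current balances as {Id: Balance}."""
    return dict(conn.execute("SELECT Id, Balance FROM Account ORDER BY Id").fetchall())


ok = transfer(1, 2, 30)       # both updates succeed and are committed
failed = transfer(1, 2, 500)  # the debit would make the balance negative
final = balances()
```

After the failed transfer the balances are the same as before it started, so no half-completed transaction is left in the database.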
Q.6 What is meant by concurrency control ? AU : Dec.-15
A mechanism which ensures that the simultaneous execution of more than one transaction does not lead to any
database inconsistency is called a concurrency control mechanism.
Q.7 State the need for concurrency control. AU : Dec.-17
OR
Q.8 Why is it necessary to have control of concurrent execution of transactions ? How is it made possible ? AU :
Dec.-02
Following are the purposes of concurrency control –
o To ensure isolation
o To resolve read-write or write-write conflicts
o To preserve consistency of database
Q.9 List commonly used concurrency control techniques. AU : Dec.-11
The commonly used concurrency control techniques are –
i) Lock ii)Timestamp
iii) Snapshot Isolation
Q.10 What is meant by serializability? How it is tested? AU : May-14,18, Dec.-14,16
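Serializability is the property that a non-serial (interleaved) schedule is equivalent to some serial schedule of
the same transactions. Conflict serializability is tested using the precedence graph technique: the schedule is conflict
serializable if and only if its precedence graph has no cycle.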
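The precedence graph test can be sketched as follows. The schedule encoding (transaction, operation, data item) and the function names are assumptions made for this illustration; an edge Ti -> Tj is added when an operation of Ti conflicts with a later operation of Tj.

```python
def precedence_edges(schedule):
    """schedule: list of (transaction, 'R' or 'W', data_item) in execution order."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {node: "new" for node in graph}

    def visit(node):
        state[node] = "active"
        for nxt in graph[node]:
            if state[nxt] == "active" or (state[nxt] == "new" and visit(nxt)):
                return True
        state[node] = "done"
        return False

    return any(state[node] == "new" and visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# T1 finishes with A before T2 touches it: only the edge T1 -> T2 exists.
s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
# T2 writes A between T1's read and write: edges T1 -> T2 and T2 -> T1 form a cycle.
s2 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
```

Schedule s1 passes the test and s2 does not.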
Q.11 What is serializable schedule ? AU : May-17
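A schedule in which the transactions execute one after the other is called a serial schedule. A serializable
schedule is a schedule, possibly interleaved, whose effect is equivalent to that of some serial schedule, so it leaves the
database in a consistent state.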
Q.12 When are two schedules conflict equivalent ? AU : Dec.-08
Two schedules are conflict equivalent if :
They contain the same set of transactions and the same operations.
Every pair of conflicting operations is ordered the same way in both schedules.
Q.13 Define two phase locking. AU : May-13
The two phase locking is a protocol in which there are two phases :
i) Growing Phase (Locking Phase) : It is a phase in which the transaction may obtain
locks but does not release any lock.
ii) Shrinking Phase (Unlocking Phase) : It is a phase in which the transaction may
release the locks but does not obtain any new lock.
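A minimal sketch of the two-phase rule for a single transaction. The encoding of the lock requests as (action, data item) pairs and the function name are assumptions made for illustration.

```python
def follows_two_phase_locking(actions):
    """actions: list of ('lock' or 'unlock', data_item) issued by one transaction, in order."""
    shrinking = False
    for action, _item in actions:
        if action == "unlock":
            shrinking = True   # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False       # acquiring a lock after releasing one breaks the protocol
    return True


# All locks are obtained before any lock is released.
t_ok = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
# A is released before B is locked, so a lock is obtained in the shrinking phase.
t_bad = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

The first sequence obeys the protocol; the second does not.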
Q.14 What is the difference between shared lock and exclusive lock? AU : May-18
1. Shared lock: it is used when the transaction wants to perform only a read operation.
   Exclusive lock: it is used when the transaction wants to perform both read and write operations.
2. Shared lock: multiple shared locks can be held on a data item simultaneously.
   Exclusive lock: only one exclusive lock can be placed on a data item at a time.
3. Shared lock: using a shared lock, the data item can only be viewed.
   Exclusive lock: using an exclusive lock, data can be inserted, updated or deleted.
Q.15 What type of lock is needed for insert and delete operations. AU : May-17
An exclusive lock is needed for insert and delete operations.
Q.16 What benefit does strict two-phase locking provide ? What disadvantages result ? AU : May-06, 07, Dec.-07
Benefits :
1. Strict two-phase locking ensures that any data written by an uncommitted transaction is locked in exclusive mode
until the transaction commits, which prevents other transactions from reading that data.
2. This protocol solves the dirty read problem and avoids cascading rollbacks.
Disadvantage:
1. Concurrency is reduced, because exclusive locks are held for a longer time.
Q.17 What is rigorous two phase locking protocol ? AU : Dec.-13
This is a stricter form of the two phase locking protocol. Here all locks are held until the transaction commits.
Q.18 Differentiate strict two phase locking and rigorous two phase locking protocol. AU : May-16
In the strict two phase locking protocol, all the exclusive mode locks are held until the transaction commits.
The rigorous two phase locking protocol is stricter than the strict two phase locking protocol. Here all locks,
shared as well as exclusive, are held until the transaction commits.
Q.19 Define deadlock. AU : May-08,09,14
Deadlock is a situation in which two or more transactions each hold a lock and wait for another lock that is
currently held by one of the other transactions, so none of them can proceed.
Q.20 List four conditions for deadlock. AU : Dec.-16
1. Mutual exclusion condition 2. Hold and wait condition
3. No preemption condition 4. Circular wait condition
Q.21 Why is recovery needed ? AU : May-09
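Failures such as transaction errors, system crashes and disk failures can leave the database in an inconsistent
state. A recovery scheme is needed to restore the database to the consistent state that existed before the failure.
Due to the recovery mechanism, the database remains highly available to its users.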
PART B
1. Explain in detail about Lock based protocols and Timestamp based protocols.
2. Write briefly about serializability with example.
3. Explain Two phase locking protocol in detail.
4. Write about immediate update and deferred update recovery techniques.
5. Explain the concept of Deadlock avoidance and prevention in detail.
UNIT IV
IMPLEMENTATION TECHNIQUES
PART A
Q.1 What is the need for RAID? AU : May-13
o RAID is a technology that is used to increase the performance.
o It is used for increased reliability of data storage.
o An array of multiple disks accessed in parallel will give greater throughput than a single disk.
o With multiple disks and a suitable redundancy scheme, your system can stay up and running when a disk fails,
and even while the replacement disk is being installed and its data restored.
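One redundancy scheme of this kind is parity, which parity-based RAID levels such as RAID 5 rely on. The sketch below is a simplified illustration, not a description of any particular controller: the block contents and the function name xor_blocks are assumptions. The parity block is the byte-wise XOR of the data blocks, so any single lost block can be rebuilt from the survivors.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_disks = [b"DBMS", b"RAID", b"DISK"]  # hypothetical 4-byte blocks, one per disk
parity = xor_blocks(data_disks)           # stored on a separate disk

# Suppose the second disk fails: XOR of the surviving data blocks and the
# parity block restores its contents.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

The rebuilt block equals the block that was on the failed disk.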
Q.2 Define Software and hardware RAID systems AU : May-16
Hardware RAID : The hardware-based array manages the RAID subsystem independently from the host. It
presents a single disk per RAID array to the host.
Software RAID : Software RAID implements the various RAID levels in the kernel disk code. It offers the
cheapest possible solution, as expensive disk controller cards are not required.
Q.3 What are ordered indices ? AU : June-09,Dec. -11,17, May-14
An ordered index is an index whose entries are kept in sorted order of the search-key values. Various ordered
indices are primary indexing and secondary indexing.
Q.4 What are the two types of ordered indices ? AU : Dec.-06
Two types of ordered indices are - Primary indexing and secondary indexing.
The primary indexing can be further classified into dense indexing and sparse indexing
and single level indexing and multilevel indexing.
Q.5 Give the comparison between ordered indices and hashing AU : Dec.-06
(1) If range queries are common, ordered indices are to be used.
(2) The buckets containing records can be chained in sorted order in case of ordered indices.
(3) Hashing is generally better at retrieving records having a specified value of the key.
(4) Hash function assigns values randomly to buckets. Thus, there is no simple notion of “next bucket in
sorted order.”
Q.6 What are the causes of bucket overflow in a hash file organization ?
Bucket overflow can occur for following reasons -
(1) Insufficient buckets : The number of buckets is too small to hold the total number of records to be
stored.
(2) Skew : Some buckets are assigned more records than are others, so a bucket might overflow even
while other buckets still have space. This situation is known as bucket skew.
Q.7 What can be done to reduce the occurrences of bucket overflows in a hash file organization ? AU : May-07,
June-09, Dec.-12
(1) A bucket is a unit of storage containing one or more records (a bucket is typically a disk block).
(2) The file blocks are divided into M equal-sized buckets, numbered bucket0, bucket1... bucketM-1. Typically, a
bucket corresponds to one (or a fixed number of) disk block.
(3) In a hash file organization we obtain the bucket of a record directly from its search-key value using a hash
function, h (K).
(4) To reduce overflow records, a hash file is typically kept 70-80% full.
(5) The hash function h should distribute the records uniformly among the buckets; otherwise, search time will be
increased because many overflow records will exist.
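The effect of the hash function on bucket occupancy can be sketched as follows. The keys, the number of buckets M = 5 and both hash functions are assumptions chosen to make the skew visible.

```python
def bucket_occupancy(keys, num_buckets, hash_function):
    """Count how many keys land in each bucket under the given hash function."""
    counts = [0] * num_buckets
    for key in keys:
        counts[hash_function(key) % num_buckets] += 1
    return counts


keys = list(range(0, 80, 10))  # 0, 10, 20, ..., 70
M = 5

# Every key is a multiple of 5, so h(K) = K mod 5 sends all records to bucket 0.
skewed = bucket_occupancy(keys, M, lambda k: k)

# Hashing on the tens digit, which actually varies, spreads the records out.
spread = bucket_occupancy(keys, M, lambda k: k // 10)
```

With the first function a single bucket receives all eight records and would overflow, while the second leaves at most two records in any bucket.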
Q.8 Distinguish between dense and sparse indices. AU : May-08, June-09
1) Dense index :
An index record appears for every search key value in file.
This record contains search key value and a pointer to the actual record.
2) Sparse index :
Index records are created only for some of the records.
To locate a record, we find the index record with the largest search key value less than or equal to the search
key value we are looking for.
We start at that record pointed to by the index record, and proceed along the pointers in the file (that is,
sequentially) until we find the desired record.
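The sparse index lookup described above can be sketched as follows. The file contents, the choice of one index entry per three records and the function name lookup are assumptions made for illustration.

```python
from bisect import bisect_right

# Sorted data file: (search_key, record) pairs; the values are hypothetical.
data_file = [(5, "r5"), (8, "r8"), (12, "r12"), (17, "r17"), (21, "r21"), (30, "r30")]

# Sparse index: one entry per block of three records -> (first key in block, position in file).
sparse_index = [(5, 0), (17, 3)]
index_keys = [key for key, _ in sparse_index]


def lookup(search_key):
    """Find a record through the sparse index, then scan the file sequentially."""
    slot = bisect_right(index_keys, search_key) - 1  # largest index key <= search_key
    if slot < 0:
        return None  # smaller than every key in the file
    position = sparse_index[slot][1]
    while position < len(data_file) and data_file[position][0] <= search_key:
        if data_file[position][0] == search_key:
            return data_file[position][1]
        position += 1
    return None
```

For example, lookup(12) starts at the index entry for key 5 and scans forward through keys 5 and 8 before reaching 12. A dense index would instead hold an entry for every key and need no scan.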
Q.9 When is it preferable to use a dense index rather than a sparse index? Explain your answer. AU: Dec. -11
1. It is preferable to use a dense index instead of a sparse index when the file is not sorted on the indexed field.
2. Or when the index file is small compared to the size of memory.
Q.10 How does B-tree differs from a B+ tree? Why is a B+ tree usually preferred as an access structure to a data
file? AU : Dec.-08
B-tree indices are similar to B+-tree indices.
The primary distinction between the two approaches is that a B-tree eliminates the redundant storage of search-
key values.
B-tree is a specialized multiway tree used to store the records in a disk.
Each node has a large number of subtrees, so the height of the tree is relatively small and only a small
number of nodes must be read from disk to retrieve an item.
The goal of B-trees is to get fast access of the data.
A B-tree allows search-key values to appear only once (if they are unique), unlike a B+-tree, where a value may
appear in a nonleaf node, in addition to appearing in a leaf node.
Q.11 What are the disadvantages of B tree over B+ tree AU : Dec.-16
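(1) In a B-tree, records or record pointers are stored in nonleaf nodes as well as in leaf nodes, so nonleaf nodes hold
fewer search keys, and insertion and deletion are more complicated than in a B+ tree.
(2) The leaf nodes of a B-tree are not linked together, so sequential and range access is less efficient than in a
B+ tree.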
Q.12 Mention different hashing techniques. AU : May-12
Two types of hashing techniques are –
i) Static hashing ii) Dynamic hashing.
Q.13 List the mechanisms to avoid collision during hashing AU : Dec.-16
Collision Resolution techniques are :
(1) Separate chaining
(2) Open addressing techniques : (i) Linear probing (ii) Quadratic probing
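Linear probing, one of the open addressing techniques listed above, can be sketched as follows. The table size of 7, the hash function key mod 7 and the inserted keys are assumptions made for illustration.

```python
def insert_linear_probing(table, key):
    """Insert key into an open-addressing table; return the slot that was used."""
    size = len(table)
    start = key % size
    for step in range(size):
        slot = (start + step) % size  # on a collision, try the next slot
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 7
# 10, 17 and 24 all hash to slot 3, so the later keys are pushed to the following slots.
slots = [insert_linear_probing(table, key) for key in (10, 17, 24, 5)]
```

Key 5 hashes to slot 5, which is already taken by 24, so it ends up in slot 6.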
Q.14 What is the basic difference between static hashing and dynamic hashing? AU : May-13, 15, Dec.-14, 15
PART B
1. How are records represented and organized in files? Explain with a suitable example.
2. Write about the various levels of RAID with neat diagrams.
3. Construct a B+ tree of order 3 with the following keys:
5, 3, 4, 9, 7, 15, 14, 21, 22, 23
4. Explain in detail distributed databases and client/server databases.
5. Explain in detail data warehousing and data mining.
6. Explain in detail mobile and web databases.
UNIT V
ADVANCED TOPICS
PART A
Q.1 Define Distributed Database Management system. AU : May-08,18, Dec.-16
A distributed database system consists of loosely coupled sites (computers) that share no physical components, and
each site runs its own database system.
Q.2 What are two approaches to store a relation in the distributed database? AU : May-04
(1) Replication : System maintains multiple copies of data, stored in different sites,
for faster retrieval and fault tolerance.
(2) Fragmentation : Relation is partitioned into several fragments stored in distinct
sites.
Q.3 What are various fragmentations? State various fragmentations with example. AU : Dec.-17
There are two types of fragmentations – horizontal fragmentation and vertical fragmentation
o Horizontal Fragmentation : In this approach, each tuple of r is assigned to one or more fragments. If relation R
is fragmented into r1 and r2 fragments, then to bring these fragments back to R we must use the union operation. That
means R = r1 ∪ r2
o Vertical Fragmentation : In this approach, the relation r is fragmented based on one or more columns. If
relation R is fragmented into r1 and r2 fragments using vertical fragmentation, then to bring these fragments back to
the original relation R we must use the join operation. That means R = r1 ⋈ r2
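Both kinds of fragmentation and their reconstruction can be sketched in Python. The relation, its attribute names and its rows are assumptions made for illustration; the vertical fragments both keep the key EmpId so that the join can match the pieces.

```python
# Relation R as a list of rows; the attribute names and values are hypothetical.
R = [
    {"EmpId": 1, "Name": "Anu", "City": "Chennai"},
    {"EmpId": 2, "Name": "Bala", "City": "Madurai"},
    {"EmpId": 3, "Name": "Chitra", "City": "Chennai"},
]

# Horizontal fragmentation: split by rows, rebuild with union.
h1 = [row for row in R if row["City"] == "Chennai"]
h2 = [row for row in R if row["City"] != "Chennai"]
# The fragments are disjoint here, so concatenation gives the union.
rebuilt_by_union = sorted(h1 + h2, key=lambda row: row["EmpId"])

# Vertical fragmentation: split by columns, keeping the key EmpId in both fragments,
# and rebuild with a natural join on that key.
v1 = [{"EmpId": row["EmpId"], "Name": row["Name"]} for row in R]
v2 = [{"EmpId": row["EmpId"], "City": row["City"]} for row in R]
v2_by_key = {row["EmpId"]: row for row in v2}
rebuilt_by_join = [{**left, **v2_by_key[left["EmpId"]]} for left in v1]
```

Both reconstructions give back the original relation R.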
Q.4 What are the advantages of distributed databases ? AU : Dec.-04, May-08
(1) There is fast data processing as several sites participate in request processing.
(2) Reliability and availability of this system is high.
(3) It has a reduced operating cost.
(4) It is easier to expand the system by adding more sites.
(5) It has improved sharing ability and local autonomy.
Q.5 List out the reasons for development of distributed databases. AU : May-06
Ans: Following are the reasons for development of distributed databases –
(1) To control the data present at geographically different sites.
(2) To obtain highly available and reliable data processing systems
Q.6 Difference between Homogeneous and Heterogeneous Schema