DBMS Chit Something

The document covers various aspects of database management, including definitions of instances and schemas, database applications, integrity constraints, and ACID properties. It also discusses operations in relational algebra, normalization, and the roles of database administrators. Additionally, it provides SQL query examples and explains concepts like concurrency control and transaction management.


PART-A

Q1. Define Instances and Schemas of database:


Schema is the database structure definition; instance is the data stored at a
particular time. Schema stays constant, instance changes frequently.
Q2. List out four potential Database Applications:
Banking systems, airline reservation, university databases, online shopping
platforms.
Q3. What is domain integrity? Give example:
Ensures attribute values fall within predefined domain. Example: Age column
must contain integers only, not text or special characters.
Q4. Define SELECT operation in Relational algebra:
SELECT (σ) operation retrieves rows (tuples) from a relation that satisfy a
specific predicate or condition.
Q5. What is transitive dependency? Give an example:
Occurs when A → B and B → C, then A → C. Example: EmpID → DeptID, DeptID
→ DeptName, hence EmpID → DeptName.
Q6. Define BCNF:
Boyce-Codd Normal Form: A relation is in BCNF if every determinant is a
candidate key, eliminating redundancy and anomalies.
Q7. Define Two Phase Commit Protocol:
Ensures atomic transaction across distributed systems in two phases: prepare
and commit. All sites agree before committing changes.
Q8. Explain about multiple granularity:
Controls concurrency by locking data at various levels—database, table, page,
or record—to improve efficiency and reduce conflicts.
Q9. List different Normal Forms:
1NF, 2NF, 3NF, BCNF, 4NF, 5NF — each removing specific redundancy types and
improving relational schema quality.
Q10. What is the use of GROUP BY clause?
Groups rows sharing a property to perform aggregate functions like COUNT,
SUM, AVG on each group separately.
PART-A (additional set)
1. List different types of entities:
Strong entity, weak entity, associative entity, and recursive entity.
2. Draw architecture diagram of a DBMS:
(Text can't display a diagram; typically includes: User → Query Processor →
DBMS Engine → Database Storage.)
3. What is a file system?
A method to store and organize data in storage devices without using a DBMS,
lacks data abstraction and security.
4. What is Data Control Language?
Part of SQL managing access permissions. Includes GRANT and REVOKE
statements to control user privileges.
5. List properties of transactions in DBMS:
Atomicity, Consistency, Isolation, Durability (ACID).
6. Write advantages of DBMS:
Data integrity, security, reduced redundancy, data abstraction, concurrent
access, backup and recovery.
7. What is concurrency control?
Mechanism to ensure correct transaction execution simultaneously without
conflicts or data inconsistency.
8. What are functional dependencies?
A relationship where one attribute uniquely determines another. Example: A →
B, meaning B depends on A.
9. Explain the concept of constraints:
Rules applied to database columns to maintain data accuracy. Examples: NOT
NULL, UNIQUE, CHECK, PRIMARY KEY.
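A minimal sqlite3 sketch of these constraint types in action (the student table, its columns, and the sample rows are illustrative assumptions, not from the original):

```python
import sqlite3

# In-memory database; table and column names are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,                   -- entity integrity
        name    TEXT NOT NULL,                         -- NOT NULL
        email   TEXT UNIQUE,                           -- UNIQUE
        age     INTEGER CHECK (age BETWEEN 18 AND 60)  -- CHECK (domain rule)
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'asha@example.com', 21)")

# A row violating the CHECK constraint is rejected by the DBMS itself.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'ravi@example.com', 12)")
    check_failed = False
except sqlite3.IntegrityError:
    check_failed = True

print(check_failed)  # True: the invalid row never reached the table
```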
10. Write different types of DBMS:
Hierarchical, Network, Relational, Object-Oriented DBMS.

Part-A questions:
Q1. Advantages of DBMS:
Data redundancy control, improved data sharing, data security, backup and
recovery, data integrity, concurrency control, easy data access, centralized
management.

Q2. Features of ER Model:


Entities, attributes, relationships, constraints, keys, generalization,
specialization, aggregation, supports conceptual database design with
diagrams representing data structure.

Q3. Active Databases:


Databases with event-driven rules (triggers) that automatically execute
actions when certain conditions or events occur, enabling reactive behavior
and automation.

Q4. Need of Normalization:


Eliminate data redundancy, avoid anomalies, ensure data consistency,
improve data integrity, organize data logically, optimize database structure
for efficient updates.

Q5. Time-Based Protocols:


Concurrency control methods using timestamps to order transactions,
preventing conflicts without locks by ensuring transactions execute in
timestamp order.

Q6. Recovery Schemes After Failure:


Methods like deferred modification and immediate modification using logs to
restore database consistency by undoing or redoing transactions after
crashes.
Q7. Hierarchical Database:
Data organized in a tree structure with parent-child relationships; example:
IBM’s IMS system, where each child has one parent.

Q8. Types of Databases:


Relational, hierarchical, network, object-oriented, distributed, NoSQL,
centralized, data warehouses, temporal, multimedia databases.

Q9. ACID Properties:


Atomicity, Consistency, Isolation, Durability ensure reliable transaction
processing maintaining database integrity despite failures or concurrent
access.

Q10. Integrity Rules of RDBMS:


Entity integrity (primary key unique, not null), referential integrity (foreign
key consistency), domain integrity (valid attribute values).

Part-A questions:

Q1. Describe the ACID properties of transaction.


ACID ensures reliable transactions: Atomicity (all or none), Consistency (valid
data), Isolation (independent execution), Durability (permanent changes after
commit).

Q2. Describe the role of Database Administrator (DBA).


DBA manages database design, security, performance, backups, user access,
and ensures data integrity and availability for smooth database operation.
Q3. What is the difference between Trigger and Stored Procedure?
Trigger auto-executes on data changes; Stored Procedure is manually called.
Triggers enforce constraints, procedures perform reusable tasks.

Q4. What are the three types of anomalies? Explain with suitable examples.
Insertion (adding data causes issues), Deletion (removing data causes loss),
Update (inconsistent data). E.g., deleting employee deletes department info.

Q5. What is serializability? Explain with suitable example.


Serializability ensures concurrent transaction schedule is equivalent to some
serial order, preserving consistency. E.g., swapping non-conflicting
reads/writes.

Q6. Describe the features of E-R Models.


E-R Model represents entities, attributes, relationships graphically, supporting
data abstraction, constraints, and aiding database design.

Q7. What is the need of schema refinement and Normal Forms?


To reduce redundancy, prevent anomalies, and improve data integrity by
organizing schema through normalization and refinement.

PART-B
Q.1 (a) State and explain various features of E-R Models.

(a) Features of E-R Model:


The Entity-Relationship (E-R) model is a high-level conceptual data model used
to design databases. Its key features include:

1. Entity Sets: Represent real-world objects (e.g., Student, Teacher).

2. Attributes: Describe entity properties (e.g., Student has RollNo, Name).


3. Relationships: Show associations between entities (e.g., Enrolls links
Student and Course).

4. Primary Keys: Uniquely identify entities in an entity set.


5. Cardinality: Specifies the number of instances involved (1:1, 1:N, M:N).
6. Participation: Indicates whether an entity's participation in a relationship
is total or partial.
7. Generalization/Specialization: Shows inheritance among entities.
8. Aggregation: Allows modeling relationships between relationships and
entities.
These features help in clearly understanding the database structure.

(b) Definitions with Examples:


1. Entity: A real-world object.
Example: Student
2. Attribute: A property of an entity.
Example: RollNo, Name
3. Relationship: Association among entities.
Example: Student "enrolls" in Course.

E-R Diagram Example

+------------+     Enrolls      +----------+
| Student    |------------------| Course   |
| RollNo,    |      (M:N)       | CourseID |
| Name       |                  | Title    |
+------------+                  +----------+

Q.2 (a) Explain about Integrity Constraints over relations in detail?


Integrity constraints in relational databases ensure data accuracy and
consistency. The main types are:
 Domain Constraints: Ensure attributes have valid values from a
predefined domain.
Example: Age must be an integer between 18 and 60.
 Entity Integrity: Ensures the primary key of a table is unique and not
null.
Example: Every student in a student table must have a unique RollNo.
 Referential Integrity: Maintains consistency between related tables
using foreign keys.
Example: A CourseID in the Enrollment table must exist in the Course
table.
 Key Constraints: Enforce uniqueness for candidate keys. Only one
primary key is allowed, but multiple candidate keys may exist.
These constraints prevent data anomalies and maintain trust in the database
system by ensuring correctness at all times.
Q.2 (b) Discuss correlated nested queries.
A correlated nested query is a type of subquery that uses values from the outer
query. Unlike simple subqueries, it executes once for every row selected by the
outer query. This makes it context-dependent and more dynamic.

Example:
SELECT e1.Name

FROM Employees e1
WHERE Salary > (SELECT AVG(Salary) FROM Employees e2 WHERE e1.DeptID =
e2.DeptID);
Here, the subquery depends on the DeptID of each outer row. It calculates the
average salary per department and filters employees with higher salaries than
their departmental average.
Such queries are useful for comparisons within groups or related data.
However, they can be slow due to multiple subquery executions.
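The behaviour can be reproduced with Python's sqlite3 module (employee names, salaries, and departments below are invented sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary INTEGER, DeptID INTEGER)")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", [
    ("Alice", 1000, 1), ("Bob", 2000, 1),   # dept 1 average: 1500
    ("Carol", 3000, 2), ("Dave", 1000, 2),  # dept 2 average: 2000
])

# The subquery is re-evaluated for each outer row, using that row's DeptID.
rows = conn.execute("""
    SELECT e1.Name
    FROM Employees e1
    WHERE Salary > (SELECT AVG(Salary) FROM Employees e2
                    WHERE e1.DeptID = e2.DeptID)
""").fetchall()

print(sorted(r[0] for r in rows))  # ['Bob', 'Carol']: above their dept average
```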
Q.3 Explain the Selection, Projection, Rename, Division, and Cartesian
product operations in relational algebra with suitable examples.

Relational algebra includes several fundamental operations:


 Selection (σ): Filters rows based on conditions.
Example: σ(Age > 20)(Student)
 Projection (π): Selects specific columns.
Example: π(Name, Age)(Student)
 Rename (ρ): Renames relations or attributes.
Example: ρ_S(Student) renames the relation Student to S.
 Division (÷): Used to find tuples related to all values in another relation.
Example: Courses taken by all students.
 Cartesian Product (×): Combines all rows from two relations.
Example: Student × Course (results in all possible student-course
combinations)
These operations are the foundation for SQL queries and help manipulate data
effectively.
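As a rough sketch, the set semantics of these operators can be mimicked in Python with sets of tuples (the Student and Course contents are invented for the example):

```python
from itertools import product

# Student(RollNo, Name, Age) and Course(CourseID, Title) as sets of tuples.
student = {(1, "Asha", 22), (2, "Ravi", 19), (3, "Meena", 25)}
course = {("C1", "DBMS"), ("C2", "Networks")}

# Selection σ(Age > 20)(Student): keep tuples satisfying the predicate.
selected = {t for t in student if t[2] > 20}

# Projection π(Name, Age)(Student): keep only the listed columns.
projected = {(t[1], t[2]) for t in student}

# Cartesian product Student × Course: concatenate every pair of tuples.
cart = {s + c for s, c in product(student, course)}

print(len(selected), len(projected), len(cart))  # 2 3 6
```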
Q.4 (a) Determine the closure of the following set of functional dependencies
for a relation scheme R(A, B, C, D, E, F, G, H), F = {A → BC, B → DEF, AD → G, A → H}.

List the candidate keys of R.


Given:
R = (A, B, C, D, E, F, G, H)
F = {A → BC, B → DEF, AD → G, A → H}
Closure of A:

 A → BC gives B and C.

 B → DEF gives D, E and F.

 With D now in the closure, AD → G gives G.

 A → H gives H.
So A⁺ = {A, B, C, D, E, F, G, H}, i.e. all attributes of R.
Candidate key: A (it is minimal, since no proper subset of {A} exists).
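Assuming the dependencies read A → BC, B → DEF, AD → G and A → H (the arrows are restored from the garbled original), the closure can be checked mechanically with the standard attribute-closure loop:

```python
def closure(attrs, fds):
    """Attribute closure: repeatedly fire every FD whose LHS is contained."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of Q.4(a) with the arrows restored (an assumption about the garbled text).
F = [("A", "BC"), ("B", "DEF"), ("AD", "G"), ("A", "H")]
print(sorted(closure("A", F)))  # every attribute of R, so A is a candidate key
```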
Q.4 (b) Determine the closure of the following set of functional dependencies
for a relational scheme R(A, B, C, D) and FDs F = {A → BC, C → D, D → A}. List out
the candidate keys of R.
Given:
R = (A, B, C, D)
F = {A → BC, C → D, D → A}
Closures:

 A⁺: A → BC gives B and C; C → D gives D. A⁺ = {A, B, C, D}.
 C⁺: C → D gives D; D → A gives A; A → BC gives B. C⁺ = {A, B, C, D}.
 D⁺: D → A gives A; A → BC gives B and C. D⁺ = {A, B, C, D}.

B⁺ = {B}, so B alone determines nothing else.

Candidate keys: A, C and D.

Q.5 (a) What is transaction? Explain the ACID Properties of transactions?


A transaction is a sequence of database operations performed as a single
logical unit. It ensures data integrity and consistency even during failures.

ACID Properties:
 Atomicity: All operations in a transaction complete or none do.
 Consistency: Ensures the database remains in a valid state before and
after the transaction.

 Isolation: Transactions do not interfere with each other’s operations.


 Durability: Once a transaction commits, its effects persist despite system
failures.
These properties are critical for reliable database management in multi-user
environments.
Q.5 (b) Explain about Database users and Administrators?
Database Users include:
 Naïve Users: Use interfaces without technical knowledge.

 Application Programmers: Write database access programs.


 Sophisticated Users: Use DB tools for data analysis.

 Specialized Users: Use complex, custom applications (e.g., scientists).


Database Administrators (DBA) manage the overall database system. Their
responsibilities include:
 Schema definition and maintenance.

 Performance tuning and backups.


 User access control and security.

 Ensuring data integrity and availability.

DBAs are crucial for smooth and secure database operations.


Q.6 (a) Explain in detail about Validation-Based Protocol?
The Validation-Based Protocol is a concurrency control method used in
transaction management. It operates in three phases:
1. Read Phase: The transaction reads values and stores changes in a local
workspace.
2. Validation Phase: The system checks whether the transaction’s updates
conflict with other concurrent transactions.

3. Write Phase: If validation succeeds, changes are written to the database.


This protocol is also called Optimistic Concurrency Control, assuming that
conflicts are rare. It is suitable for environments with low conflict probability, as
it avoids locking.

Q.6 (b) Explain in detail about the two-phase locking protocol?


The Two-Phase Locking (2PL) Protocol is used to ensure serializability in
transactions. It has two phases:
1. Growing Phase: A transaction may acquire locks but cannot release any.
2. Shrinking Phase: After releasing a lock, it cannot acquire any new locks.
Locks may be shared (read) or exclusive (write). The strict form of 2PL ensures
all locks are held until the transaction commits or aborts, preventing cascading
rollbacks.
2PL guarantees conflict serializability but may lead to deadlocks. Deadlock
detection or prevention mechanisms are often used with it.
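A toy illustration of the strict-2PL discipline (class and method names are invented; this is not a real lock manager): locks accumulate during the growing phase and are all released together at commit, after which no new lock may be taken.

```python
# Strict two-phase locking sketch: growing phase acquires, commit releases all.
class Transaction:
    def __init__(self):
        self.locks = set()
        self.committed = False

    def lock(self, item):
        if self.committed:
            # The shrinking phase has begun: 2PL forbids new locks now.
            raise RuntimeError("2PL violation: lock requested after release")
        self.locks.add(item)

    def commit(self):
        self.committed = True
        released, self.locks = self.locks, set()  # release everything at once
        return released

t = Transaction()
t.lock("A")
t.lock("B")
released = t.commit()
print(sorted(released))  # ['A', 'B'] held until commit (strict 2PL)

try:
    t.lock("C")
    violation = False
except RuntimeError:
    violation = True
print(violation)  # True: acquiring a lock after releasing is rejected
```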
Q.7 What is meant by the closure of F, where F is a set of functional
dependencies? Explain computing F⁺ with suitable examples.

The closure of F (F⁺) is the set of all functional dependencies that can be
inferred from a given set F using inference rules (Armstrong’s axioms:
Reflexivity, Augmentation, and Transitivity).
Computing F⁺ involves iteratively applying these rules to discover all implied
dependencies.
Example:
Given F = {A → B, B → C},
Then:

 A → B (given)
 B → C (given)
 A → C (by transitivity)
So F⁺ = {A → B, B → C, A → C}
F⁺ is useful in normalization and in determining candidate keys.
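For this small F, the implied dependencies can be enumerated by computing attribute closures, which for single-attribute left-hand sides is equivalent to exhaustively applying Armstrong's axioms (a sketch, not a full F⁺ algorithm):

```python
# Enumerate dependencies implied by F = {A → B, B → C} via attribute closures.
F = {"A": {"B"}, "B": {"C"}}

def closure(x):
    result = {x}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in F.items():
            if lhs in result and not rhs <= result:
                result |= rhs
                changed = True
    return result

implied = {(x, y) for x in "ABC" for y in closure(x) if y != x}
print(sorted(implied))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]: A → C by transitivity
```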
Part-B

Q.1 Given two relations:


Employee (Name, Salary, Dept_no) and Department (Dept_no, Dept_name,
Address), write a relational algebra expression for finding the sum of salaries
of all employees.

To find the sum of salaries of all employees using relational algebra, we can use
the aggregate function SUM. Given the relation Employee(Name, Salary,
Dept_no), the relational algebra expression is:

SUM_Salary ← 𝛾 SUM(Salary)(Employee)
Explanation:
The gamma (𝛾) operator is used for aggregation. Here, 𝛾 SUM(Salary) performs
the summation of the Salary attribute for all tuples in the Employee relation.
This expression returns a single tuple with the total salary paid to all
employees, regardless of department.
If we also want to find the sum of salaries grouped by department, the
expression becomes:

Dept_Salary ← 𝛾 Dept_no, SUM(Salary)(Employee)


This groups employees based on Dept_no and calculates the total salary for
each department separately. Relational algebra does not have built-in display
or print operations, so this result is a logical representation of grouped totals.
Q.2 Consider the relations given in question 1 above. Write an SQL query to
get the count of employees in each department (using Dept_no).

To count the number of employees in each department using SQL, we use the
GROUP BY clause with the COUNT() function. The query is:

SELECT Dept_no, COUNT(*) AS Employee_Count

FROM Employee
GROUP BY Dept_no;
Explanation:
This query counts how many employees are present in each department. The
GROUP BY Dept_no groups the employee records based on the department
number. The COUNT(*) function counts all the tuples (rows) in each group,
effectively providing the number of employees in each department.
If you want to include department names from the Department table as well,
you can join the two tables:
SELECT D.Dept_name, E.Dept_no, COUNT(*) AS Employee_Count
FROM Employee E
JOIN Department D ON E.Dept_no = D.Dept_no

GROUP BY D.Dept_name, E.Dept_no;


This gives a more informative output showing department names and the
employee count in each.
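The first query can be verified with sqlite3 (the sample rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Salary INTEGER, Dept_no INTEGER)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [("Asha", 1000, 1), ("Ravi", 2000, 1), ("Meena", 3000, 2)])

# GROUP BY collapses each department into one row; COUNT(*) counts its rows.
rows = conn.execute("""
    SELECT Dept_no, COUNT(*) AS Employee_Count
    FROM Employee
    GROUP BY Dept_no
    ORDER BY Dept_no
""").fetchall()
print(rows)  # [(1, 2), (2, 1)]: two employees in dept 1, one in dept 2
```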
Q.3 Consider the relations given in question 1. Write an SQL query to get the
name of the employee with the highest salary in each department.

To get the name of the employee with the highest salary in each department,
we can use a correlated subquery or ROW_NUMBER() function. Here’s a
standard SQL query using a subquery:
SELECT Name, Salary, Dept_no

FROM Employee E

WHERE Salary = (
SELECT MAX(Salary)

FROM Employee
WHERE Dept_no = E.Dept_no
);
Explanation:
This query finds employees whose salary is the maximum within their
department. For each employee E, the subquery finds the maximum salary in
E's department. If E’s salary matches that maximum, the employee is selected.
If multiple employees have the same highest salary in a department, they all
will be included.
Alternatively, using window functions (if supported):
SELECT Name, Salary, Dept_no

FROM (
SELECT Name, Salary, Dept_no,
RANK() OVER (PARTITION BY Dept_no ORDER BY Salary DESC) AS rnk
FROM Employee

) AS Ranked
WHERE rnk = 1;
This version ranks employees within departments by salary and selects those
ranked 1.
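The correlated-subquery version can be checked with sqlite3 (sample data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Salary INTEGER, Dept_no INTEGER)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", [
    ("Alice", 1000, 1), ("Bob", 2000, 1),
    ("Carol", 3000, 2), ("Dave", 1000, 2),
])

# Each outer row E is kept only if its salary equals its department's maximum.
rows = conn.execute("""
    SELECT Name, Salary, Dept_no
    FROM Employee E
    WHERE Salary = (SELECT MAX(Salary) FROM Employee WHERE Dept_no = E.Dept_no)
""").fetchall()

print(sorted(r[0] for r in rows))  # ['Bob', 'Carol']: top earner per department
```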

Q.4 What are partial functional dependencies? Explain with an example

Partial functional dependency occurs when a non-prime attribute is


functionally dependent on part of a candidate key, not the whole key. It
typically arises in relations where the primary key is composite (i.e., has more
than one attribute).
Example:
Consider a relation Student_Course(RollNo, CourseCode, Name, CourseName)
where:

 RollNo + CourseCode is the primary key.

 Name depends only on RollNo.


 CourseName depends only on CourseCode.
Here, Name and CourseName are partially dependent on the key. They depend
only on part of the composite key.
This violates 2NF (Second Normal Form), which requires that all non-prime
attributes be fully dependent on the entire candidate key.

To remove partial dependencies, we decompose the relation into:


1. Student(RollNo, Name)
2. Course(CourseCode, CourseName)

3. Enrollment(RollNo, CourseCode)
This decomposition eliminates redundancy and anomalies, ensuring each non-
prime attribute is fully dependent on a whole candidate key.

Q.5 Explain the concept of serializability with a suitable example.


Serializability ensures that the outcome of executing concurrent transactions is
the same as if those transactions had been executed serially, one after the
other, without any overlap. It is a key concept in concurrency control in DBMS.
Example:
Suppose we have two transactions:
 T1: Transfer ₹100 from A to B

 T2: Apply 10% interest to A and B


If both run concurrently without control, the balance updates might interleave
in a way that causes incorrect results.

Schedule 1 (Non-serial, non-serializable):

 T1 reads A
 T2 reads A

 T2 updates A
 T1 subtracts 100 and updates A → incorrect final A balance
Schedule 2 (Serializable):

 T1 completes
 T2 starts → result is correct
A schedule is conflict-serializable if it can be transformed into a serial schedule
by swapping non-conflicting operations. Serializability ensures data consistency
by avoiding race conditions or inconsistent reads/writes.
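The lost-update interleaving above can be simulated with plain integer arithmetic (the amounts are illustrative; 10% interest is rounded down so the numbers stay exact):

```python
# Integer simulation of the two schedules for account A.

def serial():
    a = 1000
    a = a - 100      # T1 runs to completion: debit 100
    a = a + a // 10  # then T2: add 10% interest
    return a         # 990

def interleaved():
    a = 1000
    t1_read = a                  # T1 reads A
    t2_read = a                  # T2 reads A before T1 writes
    a = t2_read + t2_read // 10  # T2 writes 1100
    a = t1_read - 100            # T1 overwrites with 900: T2's update is lost
    return a

print(serial(), interleaved())  # 990 900: the interleaving lost the interest
```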
Q.6 Why is a transaction required to be atomic? Explain using a suitable example.
A transaction is required to be atomic to ensure that it is all-or-nothing. Either
all operations in a transaction are completed successfully, or none are. This is
one of the ACID properties of transactions (Atomicity, Consistency, Isolation,
Durability).
Example:
Consider a fund transfer between two accounts:

 T1: Deduct ₹500 from Account A


 T1: Add ₹500 to Account B
If the system crashes after deducting ₹500 but before adding to Account B,
without atomicity, ₹500 is lost. This causes inconsistency in the database.
With atomicity, if any part of the transaction fails, the entire transaction is
rolled back, restoring the original state. This guarantees that the database
remains consistent even in case of failures.
Atomicity is typically implemented using a log (write-ahead log). If a failure
occurs, the system uses the log to undo incomplete transactions during
recovery.
Thus, atomicity is crucial to preserve integrity and trust in operations involving
multiple steps.
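A minimal sketch of log-based rollback (a toy in-memory undo log, not a real write-ahead log implementation; account names and amounts are illustrative):

```python
# Before-images are recorded before each write and replayed in reverse on failure.
accounts = {"A": 1000, "B": 500}
undo_log = []

def write(key, value):
    undo_log.append((key, accounts[key]))  # log the old value first
    accounts[key] = value

def transfer(amount, crash_midway=False):
    undo_log.clear()
    try:
        write("A", accounts["A"] - amount)   # step 1: debit A
        if crash_midway:
            raise RuntimeError("crash before crediting B")
        write("B", accounts["B"] + amount)   # step 2: credit B
    except RuntimeError:
        for key, old in reversed(undo_log):  # roll back: undo in reverse order
            accounts[key] = old

transfer(500, crash_midway=True)
print(accounts)  # {'A': 1000, 'B': 500}: the partial debit was undone
```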
Q.7 What is the purpose of creating views in DBMS ?
Views in DBMS are virtual tables created using a SQL query on one or more
base tables. They do not store data themselves but display data dynamically
based on the underlying tables.

Purpose of Views:
1. Data Abstraction: Hide complexity of joins or calculations by simplifying
queries for users.
2. Security: Restrict user access to specific columns or rows of a table.
Example: A view may expose only employee names and departments but not
salaries.
3. Logical Data Independence: Base tables can change without affecting
user interaction, as long as the view remains valid.

4. Customized Data Representation: Present data tailored to user needs.

o Example: A sales view showing monthly sales per region.


5. Simplification: Help users by encapsulating frequently used queries.
6. Update Restriction: Views can prevent updates on sensitive data by
restricting write access.
Example:

CREATE VIEW EmployeeView AS

SELECT Name, Dept_no FROM Employee;


This view shows only employee names and departments, improving security
and usability.
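The security benefit of the view can be demonstrated with sqlite3 (the sample row is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Dept_no INTEGER, Salary INTEGER)")
conn.execute("INSERT INTO Employee VALUES ('Asha', 1, 5000)")

# The view exposes only Name and Dept_no; Salary stays hidden from its users.
conn.execute("CREATE VIEW EmployeeView AS SELECT Name, Dept_no FROM Employee")

cur = conn.execute("SELECT * FROM EmployeeView")
cols = [d[0] for d in cur.description]
rows = cur.fetchall()
print(cols, rows)  # ['Name', 'Dept_no'] [('Asha', 1)]
```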
PART C
Q.1
a) List at least five reasons why database systems support data
manipulation using a declarative query language such as SQL, instead of
just providing a library of C or C++ functions to carry out data
manipulation.

1. Abstraction of Data Access


Declarative languages like SQL allow users to specify what they want, not how
to get it. This abstracts the complexity of query execution, optimization, and
data retrieval, allowing the DBMS to handle the low-level details such as index
usage, join order, or data access paths.
2. Optimization by the DBMS
The query optimizer in a DBMS can analyze SQL queries and automatically
choose the most efficient execution plan. With procedural code (like C/C++
functions), the developer would need to handle performance manually, which
is error-prone and usually less efficient than the DBMS's built-in optimization
strategies.

3. Portability and Standardization


SQL is a standardized language (e.g., ANSI SQL), making queries portable across
different database systems with minimal changes. In contrast, a library of C or
C++ functions would be tightly coupled to the specific DBMS implementation,
reducing portability and increasing development effort.

4. Higher Productivity and Simpler Syntax


SQL is easier to learn and use for non-programmers or domain experts (e.g.,
analysts). It allows for concise expressions of complex operations, like joins or
aggregations, which would require extensive and error-prone code in C/C++.

5. Security and Access Control


Declarative query interfaces allow for fine-grained access control, where the
DBMS can restrict what operations a user can perform via roles and
permissions. Allowing direct manipulation through C/C++ APIs would bypass
these controls or make them much harder to enforce reliably.

b) What are five main functions of a database administrator?


1. Database Installation, Configuration, and Upgrades
The DBA is responsible for installing and configuring the database management
system (DBMS). They also handle upgrading the DBMS to newer versions and
applying patches or updates as needed.
2. Database Security and Access Control
DBAs ensure the security of data by managing user accounts, roles, and
privileges. They implement authentication and authorization policies, and
monitor access to protect against unauthorized use.
3. Backup and Recovery
A critical responsibility of a DBA is to plan and implement regular backups of
the database. They must also create recovery strategies to restore data in case
of hardware failures, data corruption, or accidental deletions.

4. Performance Monitoring and Tuning


DBAs continuously monitor database performance (e.g., response time, query
execution). They tune queries, optimize indexing, and adjust system resources
to improve overall efficiency and speed.
5. Data Integrity and Availability
DBAs enforce data integrity constraints (like primary keys, foreign keys, and
unique constraints). They ensure high availability of the database through
techniques like replication, clustering, or failover systems, minimizing
downtime.
c) Explain the difference between two-tier and three-tier
architectures. Which is better suited for Web applications? Why?
Difference between Two-Tier and Three-Tier Architectures

1. Two-Tier Architecture
Structure:
Client Tier: User interface (e.g., desktop application).
Database Tier: Database server that handles data storage and query
processing.
How it works:

The client communicates directly with the database.

Application logic is typically handled within the client.


Example: A desktop application that connects directly to a SQL Server
database.

2. Three-Tier Architecture
Structure:

Presentation Tier (Client): User interface (e.g., browser).

Application Tier (Middle Layer): Application server that handles business logic.

Data Tier (Database): Database server that stores and retrieves data.
How it works:
The client communicates with the application server, which then
communicates with the database.

Application logic is centralized in the middle tier.


Example: A web browser connects to a web server, which in turn queries a
backend database.
Which Is Better Suited for Web Applications?

Three-Tier Architecture is better suited for web applications.

Why?
Separation of Concerns
Divides responsibilities: UI, business logic, and data access are all handled in
separate layers.
This improves maintainability and scalability.
Improved Security
Direct access to the database is restricted.
The application server can implement access control, input validation, and
authentication.

Scalability

Web applications often handle many users simultaneously.


Three-tier systems allow horizontal scaling of the web and application servers
to handle more traffic.
Flexibility
Easier to modify or upgrade one tier without impacting the others (e.g., update
business logic without touching the UI).
Better Support for Distributed Systems
Web applications often run across multiple servers or cloud services. A three-
tier design aligns well with distributed and cloud-based infrastructures.
Summary Table

Feature             Two-Tier Architecture          Three-Tier Architecture
Application Logic   In the client                  In the middle tier (application server)
Scalability         Limited                        High
Security            Lower                          Higher (due to middle-tier control)
Maintenance         Harder                         Easier due to modularity
Best For            Small LAN-based applications   Large, scalable Web applications

Q.2
a) List three reasons why null values might be introduced into the
database.
Here are three common reasons why NULL values might be introduced into a
database:

1. Missing or Unknown Data


A value might not be known at the time of data entry. Example: A customer
places an order but hasn't provided their delivery address yet.
2. Not Applicable Information
Some attributes might not apply to every record. Example: A spouse_name
field in an employee table would be NULL for single employees.

3. Data Entry or Input Omission


A field may be left blank accidentally or intentionally skipped during data
input. Example: A user registration form where the middle name is optional
and left empty.
These uses of NULL allow databases to handle incomplete or context-
dependent information gracefully.
Q.2 b) Discuss the relative merits of procedural and nonprocedural
languages.

Procedural vs. Nonprocedural Languages: A Comparison


Procedural and nonprocedural (declarative) languages offer different
approaches to problem-solving and are suited to different types of tasks. Here’s
a breakdown of their relative merits:

Procedural Languages (e.g., C, Java, Python)


Merits:

1. Fine-Grained Control: Developers can control the exact sequence of
operations, which is useful for complex logic, algorithms, or data
manipulation.
2. General-Purpose Use: Well-suited for a wide variety of programming tasks,
from desktop apps to system-level programming.
3. Strong Support for Control Structures: Includes loops, conditionals, and
functions, enabling detailed control of flow and logic.
4. Widely Known and Supported: Most programmers are trained in procedural
languages, and there's extensive community and library support.
Nonprocedural (Declarative) Languages (e.g., SQL, Prolog, HTML)

Merits:
1. Focus on What, Not How: The user specifies what outcome is desired, and
the system determines how to achieve it. This leads to simpler and more
readable code.
2. Less Code, Higher Productivity: Tasks that would require many lines of
procedural code can often be accomplished with one or two lines of
declarative code.
3. Optimization by the System: Because the system handles execution details,
it can optimize performance internally, as in SQL query optimization.
4. Reduced Errors: By hiding procedural logic, there's less room for
implementation errors, especially for repetitive or complex operations.

Comparison Summary Table:

Nonprocedural
Feature Procedural Languages
Languages

Focus How to perform tasks What result is desired

Control over
High Low
execution
Nonprocedural
Feature Procedural Languages
Languages

Easier for specific


Ease of use Complex for large systems
domains

Optimization Manual Handled by the system

Algorithms, general Data queries,


Best used for
programming configuration

Q.2 c) Under what circumstances would the query

SELECT *
FROM student
NATURAL FULL OUTER JOIN takes
NATURAL FULL OUTER JOIN course;

include tuples with null values for the title attribute?

The result contains tuples with a null title in these cases: a student who has
not taken any course (the full outer join pads all takes and course attributes,
including title, with nulls); a takes tuple whose course_id has no matching
tuple in course, so title is again padded with null; or a course tuple whose
title is itself null in the database.
Q.2 d) Suppose user A, who has the authorizations on a relation r,
grants select on relation r to public with grant option. Suppose user B
then grants select on r to A. Does this cause a cycle in the authorization
graph? Explain why.
No, this does not create a cycle in the authorization graph.

Here's why:

User A already has full rights on relation r and can grant those rights to others.

User B got the SELECT right from A (through PUBLIC) and then gave it back to A.

But A didn’t need B’s grant—A already had the right independently.
So, A does not depend on B, which means there is no cycle. A cycle only
happens if users depend on each other for their privileges, which isn’t the case
here.
Q.3 Explain the purpose of the checkpoint mechanism. How often should
checkpoints be performed? How does the frequency of checkpoints affect:

a) System performance when no failure occurs?

b) The time it takes to recover from a system crash?

c) The time it takes to recover from a media (disk) failure?

Purpose of the Checkpoint Mechanism


A checkpoint is a point in time where the database system saves a consistent
snapshot of the database state to disk, including all necessary information (like
memory buffers and logs) so that recovery from crashes can be faster and
more efficient.
It helps the system avoid reprocessing the entire transaction log after a crash
by marking a "safe" state to start recovery from.

How Often Should Checkpoints Be Performed?

There is no fixed rule, but checkpoints are usually taken:

Periodically (e.g., every few minutes).


After a certain number of transactions.

Before shutdown or after a major update.


The frequency depends on the system’s needs for performance vs. recovery
speed.

Effects of Checkpoint Frequency

Aspect                             | Frequent Checkpoints                        | Infrequent Checkpoints
System performance (no failure)    | May reduce performance due to I/O overhead  | Better runtime performance (less I/O)
Recovery time (system crash)       | Faster recovery (less log to replay)        | Slower recovery (more log to scan)
Recovery time (media/disk failure) | Little effect: full backups and archived logs matter more here | Same: depends more on backups than on checkpoints

Q.4 Consider the following set F of functional dependencies on the relation


schema:
r (A, B, C, D, E, F): A→ BCD, BC→ DE, B→D, D→A

a) Compute B+,
b) Prove (using Armstrong's axioms) that AF is a superkey.
c) Compute a canonical cover from the above set of functional
dependencies F; give each step of your derivation with an
explanation.
d) Give a 3NF decomposition of r based on the canonical cover.
e) Give a BCNF decomposition of r using the original set of
functional dependencies.
To address the problem, we will go through each part step by step.
1. Compute \( B^+ \)

To compute the closure of \( B \) (denoted \( B^+ \)), we start with the


attributes in \( B \) and apply the functional dependencies in \( F \) until no
new attributes can be added.
Given \( F \):
- \( A \rightarrow BCD \)

- \( BC \rightarrow DE \)
- \( B \rightarrow D \)

- \( D \rightarrow A \)

Step 1: Start with \( B^+ = \{ B \} \).

Step 2: Apply \( B \rightarrow D \):


- Now, \( B^+ = \{ B, D \} \).

Step 3: Apply \( D \rightarrow A \):

- Now, \( B^+ = \{ B, D, A \} \).

Step 4: Apply \( A \rightarrow BCD \):


- Since \( A \) is now in \( B^+ \), we can add \( B, C, D \) (but \( B \) and \( D \)
are already there):
- Now, \( B^+ = \{ A, B, C, D \} \).

Step 5: Apply \( BC \rightarrow DE \):


- Since \( B \) and \( C \) are in \( B^+ \), we can add \( D \) and \( E \) (but \( D
\) is already there):
- Now, \( B^+ = \{ A, B, C, D, E \} \).

Thus, the closure \( B^+ = \{ A, B, C, D, E \} \).

2. Prove that \( AF \) is a superkey using Armstrong's axioms


To prove that \( AF \) is a superkey, we need to show that \( AF^+ \) contains all
attributes of the relation \( r \).

Step 1: Start with \( AF^+ = \{ A, F \} \).

Step 2: Apply \( A \rightarrow BCD \):


- Now, \( AF^+ = \{ A, B, C, D, F \} \).

Step 3: Apply \( D \rightarrow A \):

- \( A \) is already in \( AF^+ \).

Step 4: Apply \( B \rightarrow D \):


- \( D \) is already in \( AF^+ \).

Step 5: Apply \( BC \rightarrow DE \):


- Since \( B \) and \( C \) are in \( AF^+ \), we can add \( D \) and \( E \) (but \( D
\) is already there):

- Now, \( AF^+ = \{ A, B, C, D, E, F \} \).

Each closure step above is an application of Armstrong's axioms:
reflexivity gives \( AF \rightarrow A \), and transitivity with the given
dependencies (e.g., \( AF \rightarrow A \) and \( A \rightarrow BCD \)
yield \( AF \rightarrow BCD \)) extends the closure. Since \( AF^+ \)
contains all attributes \( \{ A, B, C, D, E, F \} \), we conclude that
\( AF \) is a superkey.

3. Compute a canonical cover from the set of functional dependencies \( F \)

To find a canonical cover, we repeatedly remove extraneous attributes and
combine dependencies with the same left-hand side until no change is possible.

Step 1: Test for extraneous attributes on the left-hand sides.

- In \( BC \rightarrow DE \): is \( C \) extraneous? Compute \( B^+ \) under
\( F \): \( B^+ = \{ A, B, C, D, E \} \supseteq \{ D, E \} \), so
\( B \rightarrow DE \) already holds and \( C \) is extraneous. Replace
\( BC \rightarrow DE \) with \( B \rightarrow DE \).

Step 2: Combine dependencies with the same left-hand side.

- \( B \rightarrow DE \) subsumes \( B \rightarrow D \), so drop
\( B \rightarrow D \). The set is now
\( \{ A \rightarrow BCD, B \rightarrow DE, D \rightarrow A \} \).

Step 3: Test for extraneous attributes on the right-hand sides.

- In \( A \rightarrow BCD \): is \( D \) extraneous? Under
\( \{ A \rightarrow BC, B \rightarrow DE, D \rightarrow A \} \) we get
\( A^+ = \{ A, B, C, D, E \} \ni D \), so \( D \) is extraneous. Replace
with \( A \rightarrow BC \).

- \( B \) and \( C \) in \( A \rightarrow BC \), and \( D \) and \( E \) in
\( B \rightarrow DE \), are not extraneous: removing any of them makes that
attribute underivable from the remaining dependencies.

No further attributes or dependencies can be removed, so the canonical
cover is:

$$
F_c = \{ A \rightarrow BC, B \rightarrow DE, D \rightarrow A \}
$$

4. Give a 3NF decomposition of \( r \) based on the canonical cover

Using the 3NF synthesis algorithm, create one relation per dependency in
\( F_c = \{ A \rightarrow BC, B \rightarrow DE, D \rightarrow A \} \):

1. From \( A \rightarrow BC \), create \( R_1(A, B, C) \).

2. From \( B \rightarrow DE \), create \( R_2(B, D, E) \).

3. From \( D \rightarrow A \), create \( R_3(A, D) \).

Since \( F \) appears on no right-hand side, it must be in every candidate
key; the candidate keys of \( r \) are \( AF \), \( BF \), and \( DF \).
None of \( R_1, R_2, R_3 \) contains a candidate key, so add one:

4. \( R_4(A, F) \).

The 3NF decomposition is:

- \( R_1(A, B, C) \)
- \( R_2(B, D, E) \)
- \( R_3(A, D) \)
- \( R_4(A, F) \)

5. Give a BCNF decomposition of \( r \) using the original set of functional
dependencies

To decompose into BCNF, we need to ensure that for every functional
dependency \( X \rightarrow Y \) holding on a relation, \( X \) is a
superkey of that relation.

1. In \( r(A, B, C, D, E, F) \), \( A \rightarrow BCD \) violates BCNF:
\( A^+ = \{ A, B, C, D, E \} \) does not include \( F \), so \( A \) is not
a superkey. Decompose \( r \) into \( R_1 = A^+ = (A, B, C, D, E) \) and
\( R_2 = (A, F) \).

2. Check \( R_1(A, B, C, D, E) \): its candidate keys are \( A \), \( B \),
and \( D \) (each closure covers all of \( R_1 \)). Every dependency
holding on \( R_1 \), namely \( A \rightarrow BCD \), \( BC \rightarrow DE \),
\( B \rightarrow D \), and \( D \rightarrow A \), has a superkey on the
left-hand side, so \( R_1 \) is already in BCNF.

3. Check \( R_2(A, F) \): neither \( A \rightarrow F \) nor
\( F \rightarrow A \) holds, so there are no nontrivial dependencies and
\( R_2 \) is in BCNF.

The final BCNF decomposition is:

- \( R_1(A, B, C, D, E) \)
- \( R_2(A, F) \)

The decomposition is lossless because \( R_1 \cap R_2 = \{ A \} \) and
\( A \rightarrow R_1 \).

Q.5 Consider the following extension to the tree-locking protocol, which


allows both shared and exclusive locks;
a) A transaction can be either a read-only transaction, in which case
it can request only shared locks, or an update transaction, in
which case it can request only exclusive locks.
b) Each transaction must follow the rules of the tree
protocol. Read-only transactions may lock any data item
first, whereas update transactions must lock the root first.
c) Show that the protocol ensures serializability and
deadlock freedom.
Tree-Locking Protocol Rules:
Read-only Transactions: Can only use shared locks and can lock any data item
in any order.
Update Transactions: Can only use exclusive locks (which prevent others from
accessing data) and must lock the root of the tree first.

1. Ensuring Serializability:
Serializability means that even if multiple transactions run at the same time,
the end result is the same as if they ran one after the other (in some order).
Read-only transactions don’t change data, so they don’t block each other
when they lock data.
Update transactions lock data in a strict order (starting with the root), which
prevents any confusion about which transaction should go first. This ordered
process means their actions can be rearranged to look like they were done one
at a time (serially).
Result: The protocol makes sure transactions are serializable (the final
outcome is the same as if they were done one at a time).

2. Ensuring Deadlock Freedom:


Deadlock happens when two transactions are waiting for each other, and
neither can proceed. This protocol avoids that problem.
Read-only transactions don’t cause deadlock because they just share data
without blocking others.
Update transactions have to lock the data in a specific order (always starting
with the root). This strict order stops transactions from waiting on each other
in a circle, which means no deadlocks.
Result: The protocol prevents deadlocks because of the strict order for locking
data.

PART-C
Q1 Explain the various types of joins in SQL, Give suitable examples for each
type.

1. INNER JOIN
Retrieves only the matching rows from both tables based on the join condition.
If there is no match, the row is excluded from the result.
2. LEFT JOIN (or LEFT OUTER JOIN)
Retrieves all rows from the left table, and the matching rows from the right
table.
If there is no match in the right table, NULL is returned for those columns.
3. RIGHT JOIN (or RIGHT OUTER JOIN)
Retrieves all rows from the right table, and the matching rows from the left
table.

If there is no match in the left table, NULL is returned for those columns.

4. FULL JOIN (or FULL OUTER JOIN)

Retrieves all rows when there is a match in either left or right table.

If there is no match on one side, NULL is returned for the missing values.
5. CROSS JOIN
Returns the Cartesian product of both tables.
Every row from the first table is paired with every row from the second table.
6. SELF JOIN
A table is joined with itself.

Useful for comparing rows within the same table.
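The join types above can be exercised on a toy pair of tables (table and column names are made up for the demo); sqlite3 from Python's standard library stands in for a full DBMS:

```python
import sqlite3

# Minimal two-table setup to illustrate join behavior.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp  (id INTEGER, name TEXT, dept_id INTEGER);
    CREATE TABLE dept (id INTEGER, dname TEXT);
    INSERT INTO emp  VALUES (1, 'Ann', 10), (2, 'Raj', 20), (3, 'Li', NULL);
    INSERT INTO dept VALUES (10, 'Sales'), (30, 'HR');
""")

# INNER JOIN: only matching rows survive.
inner = con.execute(
    "SELECT name, dname FROM emp JOIN dept ON emp.dept_id = dept.id").fetchall()
# LEFT JOIN: Ann matches; Raj and Li get NULL for dname.
left = con.execute(
    "SELECT name, dname FROM emp LEFT JOIN dept ON emp.dept_id = dept.id").fetchall()
# CROSS JOIN: Cartesian product, 3 emp rows x 2 dept rows.
cross = con.execute("SELECT COUNT(*) FROM emp CROSS JOIN dept").fetchone()[0]
# SELF JOIN: pair employees sharing a department (none do here).
selfj = con.execute("""
    SELECT a.name, b.name FROM emp a JOIN emp b
    ON a.dept_id = b.dept_id AND a.id < b.id""").fetchall()

print(inner)  # [('Ann', 'Sales')]
print(cross)  # 6
```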


Q 2 What is Normalization & why we need it?
Normalization is the process of organizing data in a database to reduce
redundancy and improve data integrity. It involves dividing large tables into
smaller, related tables and defining relationships between them.

Why Do We Need Normalization?


Eliminate Data Redundancy

Avoid storing the same data in multiple places.


Ensure Data Consistency

Changes in one place reflect everywhere due to minimized duplication.

Improve Data Integrity

Enforces correct and valid data using constraints and relationships.


Simplify Queries and Maintenance

Structured design makes data easier to manage and update.


Efficient Use of Storage
Saves space by removing repeated data.
Q 3 Take any sample database scheme and write down steps to convert it to
3NF

Sample Database Schema (Unnormalized)

Let’s say we have a table:

Student_Info(StudentID, Name, Course, InstructorName, InstructorPhone)

Step-by-Step Conversion to 3NF

🔹 Step 1: Convert to First Normal Form (1NF)


Rule: Eliminate repeating groups; ensure atomic (indivisible) values.
Action: Make sure each column contains only single-valued, atomic data.
Result:
Still: Student_Info(StudentID, Name, Course, InstructorName, InstructorPhone)
(Assuming all values are atomic)

🔹 Step 2: Convert to Second Normal Form (2NF)


Rule: Must be in 1NF and all non-key attributes must depend on the whole
primary key.
Action: Remove partial dependencies (where attributes depend on part of a
composite key).
Let’s assume StudentID + Course is the composite key.
Now:
Name depends only on StudentID → Partial dependency
InstructorName and InstructorPhone depend only on Course → Partial
dependency
Decompose into:
Student(StudentID, Name)

Enrollment(StudentID, Course)

Course_Info(Course, InstructorName, InstructorPhone)

🔹 Step 3: Convert to Third Normal Form (3NF)

Rule: Must be in 2NF and no transitive dependency (non-key attribute


depending on another non-key attribute).

Action: Remove transitive dependencies.

In Course_Info,
InstructorPhone depends on InstructorName, not directly on Course →
Transitive dependency.

Decompose into:
Course(Course, InstructorName)

Instructor(InstructorName, InstructorPhone)

Final 3NF Tables:

Student(StudentID, Name)
Enrollment(StudentID, Course)

Course(Course, InstructorName)
Instructor(InstructorName, InstructorPhone)
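The final 3NF tables can be realized directly as SQL DDL (column types and sample data are illustrative assumptions); the decomposition stores each instructor's phone once, and a join reassembles the original wide row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE Instructor (InstructorName TEXT PRIMARY KEY,
                             InstructorPhone TEXT);
    CREATE TABLE Course     (Course TEXT PRIMARY KEY,
                             InstructorName TEXT
                               REFERENCES Instructor(InstructorName));
    CREATE TABLE Student    (StudentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Enrollment (StudentID INTEGER REFERENCES Student(StudentID),
                             Course TEXT REFERENCES Course(Course),
                             PRIMARY KEY (StudentID, Course));
    INSERT INTO Instructor VALUES ('Dr. Rao', '555-0101');
    INSERT INTO Course     VALUES ('DBMS', 'Dr. Rao');
    INSERT INTO Student    VALUES (1, 'Meera');
    INSERT INTO Enrollment VALUES (1, 'DBMS');
""")

# Joining the four tables reconstructs the original Student_Info row.
row = con.execute("""
    SELECT s.Name, e.Course, c.InstructorName, i.InstructorPhone
    FROM Enrollment e
    JOIN Student s    ON s.StudentID = e.StudentID
    JOIN Course c     ON c.Course = e.Course
    JOIN Instructor i ON i.InstructorName = c.InstructorName
""").fetchone()
print(row)  # ('Meera', 'DBMS', 'Dr. Rao', '555-0101')
```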

Q 5 Differentiate between conflict and view serializability.

Conflict Serializability
Conflict serializability ensures that a schedule can be transformed into a serial
schedule by swapping non-conflicting operations.
It is based on conflicting operations like read-write, write-read, and write-write
on the same data item.

It can be easily tested using a precedence graph.


If the precedence graph has no cycle, the schedule is conflict serializable.

It is a stricter and more restrictive form of serializability.

View Serializability
View serializability ensures that a schedule is view-equivalent to a serial
schedule.
It considers the final outcome of transactions (i.e., the values read and written)
rather than the order of operations.

It is harder to test and is computationally expensive.


View serializability is a more general and flexible concept than conflict
serializability.

Q 6 How do we test for conflict serializability? Explain with suitable


example.
To test whether a schedule is conflict serializable, we use a method called the
Precedence Graph (also known as Serialization Graph).

🔹 Steps to Test Conflict Serializability

1. Create a node for each transaction in the schedule.

2. Identify conflicting operations. Conflicts occur when two operations:
   - Belong to different transactions,
   - Operate on the same data item,
   - And at least one is a write.

3. Draw a directed edge from one transaction to another if an operation in
the first transaction conflicts with and precedes an operation in the
second.

4. Check for cycles in the graph:
   - If the graph has no cycle, the schedule is conflict serializable.
   - If there is a cycle, the schedule is not conflict serializable.

🔹 Example (Theory Only)

Consider a schedule with two transactions T1 and T2:


T1: Read(A), Write(A)
T2: Read(A), Write(A)
Operations from T1 and T2 conflict because they access the same data item (A),
and at least one is a write.
If T1’s write on A comes before T2’s read/write on A → Draw edge T1 → T2

If T2’s write on A comes before T1’s read/write on A → Draw edge T2 → T1

If both edges are present, a cycle is formed → Not conflict serializable

If only one direction exists and no cycle → Conflict serializable
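The steps above can be sketched as a small checker (a teaching sketch only; real DBMSs enforce serializability with locking rather than testing schedules after the fact):

```python
# Build precedence-graph edges between transactions with conflicting
# operations, then look for a cycle with depth-first search.
def precedence_edges(schedule):
    """schedule: list of (txn, op, item), op in {'R', 'W'}, in time order."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and 'W' in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def dfs(u):
        color[u] = GRAY
        for v in graph.get(u, []):
            if color.get(v, WHITE) == GRAY:
                return True            # back edge means a cycle
            if color.get(v, WHITE) == WHITE and dfs(v):
                return True
        color[u] = BLACK
        return False
    return any(color.get(u, WHITE) == WHITE and dfs(u) for u in graph)

# Interleaved reads then writes of A produce edges both ways: a cycle.
bad = [('T1', 'R', 'A'), ('T2', 'R', 'A'), ('T1', 'W', 'A'), ('T2', 'W', 'A')]
# Serial schedule: every edge points T1 -> T2 only.
good = [('T1', 'R', 'A'), ('T1', 'W', 'A'), ('T2', 'R', 'A'), ('T2', 'W', 'A')]

print(has_cycle(precedence_edges(bad)))   # True  (not conflict serializable)
print(has_cycle(precedence_edges(good)))  # False (conflict serializable)
```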

Q 7 Write a short note on indexing in DBMS.


Indexing in DBMS: Indexing is a data structure technique used in DBMS to
speed up the retrieval of records from a database table. Instead of searching
every row (which is time-consuming), the DBMS uses an index to quickly locate
the data.

🔹 Purpose of Indexing

To improve the performance of query processing.

To reduce disk I/O operations.


To allow faster searching, sorting, and filtering of data.

🔹 Types of Indexes

Primary Index – Built on the primary key of a table.


Secondary Index – Built on non-primary key columns.
Clustered Index – Alters the physical order of data in the table.
Non-clustered Index – Does not change the table's physical order; uses a
separate structure.

🔹 Advantages

Faster data access.


Improves query performance.
Helps in sorting and searching operations.

🔹 Disadvantages

Requires extra storage.


Slows down insert, delete, and update operations due to index maintenance.
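The effect of an index can be observed directly in SQLite (table and index names are made up for the demo): the same query switches from a full table scan to an index search once the index exists.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (id INTEGER, name TEXT, salary INTEGER)")
con.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                [(i, f"e{i}", 1000 + i) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN reports how SQLite will access each table.
    return " ".join(row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT name FROM emp WHERE salary = 1500"
before = plan(query)                              # full table scan
con.execute("CREATE INDEX idx_salary ON emp(salary)")
after = plan(query)                               # search using idx_salary
print(before)
print(after)
```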

Q 8 What are B-trees and why do DBMS use then? Explain with
examples.
What are B-Trees in DBMS: A B-Tree is a self-balanced tree data structure used
in databases and file systems to maintain sorted data and allow searches,
insertions, deletions, and sequential access in logarithmic time.

Why DBMS Use B-Trees


Efficient Disk Access
B-Trees are optimized for systems that read and write large blocks of data, like
disk storage.

Balanced Structure

B-Trees remain balanced, ensuring consistent performance for all operations.


Multi-level Indexing
Suitable for indexing large databases, where data is stored across many pages
or blocks.

Fast Searching
B-Trees reduce the number of disk I/O operations needed to locate a record.

Dynamic Growth
B-Trees can grow or shrink efficiently as data is inserted or deleted.
Key Features of B-Trees

All leaf nodes are at the same level.


Each node contains multiple keys and children.

Nodes split or merge to maintain balance when data is inserted or removed.

Example (Theory)
Suppose a B-Tree of order 3 is used:

Each node can have at most 2 keys and 3 children.


Data is kept sorted, and searching starts at the root, navigating down to the
correct leaf node.
Conclusion: DBMS use B-Trees because they provide balanced, efficient, and
scalable indexing and searching mechanisms, especially suited for large
volumes of data stored on disk.
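A minimal sketch of how search descends a B-Tree (search only; insertion with node splitting is omitted, and the node layout is a simplification of what a DBMS stores in disk pages):

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys held in this node
        self.children = children or []    # len(children) == len(keys) + 1

def search(node, key):
    """Return True if key is in the subtree rooted at node."""
    i = bisect_left(node.keys, key)       # binary search within the node
    if i < len(node.keys) and node.keys[i] == key:
        return True
    if not node.children:                 # reached a leaf without finding key
        return False
    return search(node.children[i], key)  # descend into the i-th subtree

# Order-3 tree: each node holds at most 2 keys; all leaves on one level.
root = Node([20, 40], [Node([5, 10]), Node([25, 30]), Node([50])])
print(search(root, 30), search(root, 7))  # True False
```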

Q 9 Discuss any five aggregate functions in SQL.


1. COUNT()
Returns the number of rows in a table or the number of non-NULL values in a
column.

Commonly used to count total records.


2. SUM()

Calculates the total sum of a numeric column.

Ignores NULL values while summing.


3. AVG()

Returns the average value of a numeric column.


It also ignores NULL values during calculation.
4. MAX()
Returns the highest (maximum) value in a column.

Can be used with numeric, date, or string data types.


5. MIN()

Returns the lowest (minimum) value in a column.

Useful for finding the smallest number, earliest date, etc.

Q 10 Write a suitable SQL query to illustrate each of them.


1. COUNT()

Counts the number of rows or non-NULL values in a column.

SELECT COUNT(column_name) FROM table_name;


2. SUM()

Calculates the total sum of a numeric column.

SELECT SUM(column_name) FROM table_name;

3. AVG()
Calculates the average value of a numeric column.
SELECT AVG(column_name) FROM table_name;

4. MAX()
Finds the maximum value in a column.

SELECT MAX(column_name) FROM table_name;

5. MIN()
Finds the minimum value in a column.

SELECT MIN(column_name) FROM table_name;
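The five aggregates can be run together on a small table (names and values are made up). Note how NULLs are handled: SUM, AVG, MAX, MIN, and COUNT(column) all skip NULLs, while COUNT(*) counts every row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (item TEXT, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("pen", 10), ("book", 50), ("pen", 30), ("bag", None)])

row = con.execute("""
    SELECT COUNT(*), COUNT(amount), SUM(amount), AVG(amount),
           MAX(amount), MIN(amount)
    FROM sales
""").fetchone()
print(row)  # (4, 3, 90, 30.0, 50, 10)
```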


Q4. Explain Embedded SQL and its need. How it is different from Dynamic
SQL?
Embedded SQL is a method of combining the computing power of a high-level
programming language (like C, Java) with SQL's powerful database
manipulation capabilities. In embedded SQL, SQL statements are directly
written within the host language code and are preprocessed before
compilation.

Need for Embedded SQL:

1. Allows integration of SQL with procedural languages.

2. Facilitates complex application logic along with database access.


3. Simplifies the development of data-driven applications.
Example:
EXEC SQL SELECT name INTO :empName FROM employee WHERE id = :empID;
Here, :empName and :empID are host variables.
Dynamic SQL: Dynamic SQL allows SQL statements to be constructed and
executed at runtime. It is more flexible as queries can change depending on
program conditions.

Example:
EXECUTE IMMEDIATE 'SELECT * FROM employee WHERE dept = ''' ||
deptName || '''';
Differences:

Feature          | Embedded SQL                  | Dynamic SQL
Compilation time | At compile time               | At runtime
Flexibility      | Less flexible                 | Highly flexible
Usage            | Known, static queries         | User-defined, dynamic queries
Performance      | Faster due to pre-compilation | Slightly slower due to interpretation

Use Cases:
 Embedded SQL is preferred for predefined queries.
 Dynamic SQL is suitable for applications requiring ad-hoc query
generation.
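The contrast can be mimicked from a host language. Here Python's DB-API parameter placeholders play the role of embedded SQL's host variables (the statement text is fixed), while string-building the statement at runtime corresponds to dynamic SQL; the table and data are made up for the demo.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (id INTEGER, name TEXT, dept TEXT)")
con.execute("INSERT INTO employee VALUES (1, 'Asha', 'HR'), (2, 'Ben', 'IT')")

# Embedded-SQL style: the statement is fixed; only host variables change
# (analogous to ':empID' in EXEC SQL).
emp_id = 1
name = con.execute("SELECT name FROM employee WHERE id = ?",
                   (emp_id,)).fetchone()[0]
print(name)  # Asha

# Dynamic-SQL style: the statement itself is assembled at runtime,
# e.g. when the filter column is chosen by the user.
column = "dept"                      # imagine this arrives at runtime
assert column in {"name", "dept"}    # whitelist to avoid SQL injection
query = f"SELECT name FROM employee WHERE {column} = ?"
rows = con.execute(query, ("IT",)).fetchall()
print(rows)  # [('Ben',)]
```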

Q5. What is trigger? What is the difference between a trigger and procedure?
A trigger is a set of SQL statements that automatically executes in response to
certain events on a table or view, such as INSERT, UPDATE, or DELETE. Triggers
help enforce business rules, ensure data consistency, and maintain audit trails.

Syntax Example:
CREATE TRIGGER audit_log
AFTER INSERT ON employee
FOR EACH ROW
BEGIN

INSERT INTO audit_table VALUES (:new.empID, CURRENT_TIMESTAMP);


END;

Difference Between Trigger and Procedure:

Feature    | Trigger                          | Procedure
Execution  | Automatically on event           | Manually invoked
Invocation | Cannot be called directly        | Called by user or application
Timing     | BEFORE/AFTER DML operations      | Any time during program execution
Use Case   | Enforce constraints, log changes | Reusable tasks, batch operations

Key Points:
 Triggers are bound to table events.
 Procedures must be explicitly called.

 Triggers can’t take parameters; procedures can.


Triggers are best suited for automatic responses to data changes, while
procedures are useful for reusable routines.
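A runnable version of the audit-log trigger in SQLite (SQLite writes `NEW.col` where Oracle's PL/SQL writes `:new.col`; table names are made up): no one calls the trigger explicitly, yet the audit row appears.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE employee (empID INTEGER, name TEXT);
    CREATE TABLE audit_table (empID INTEGER, at TEXT);

    -- Fires automatically after every INSERT on employee.
    CREATE TRIGGER audit_log AFTER INSERT ON employee
    FOR EACH ROW
    BEGIN
        INSERT INTO audit_table VALUES (NEW.empID, datetime('now'));
    END;
""")

con.execute("INSERT INTO employee VALUES (7, 'Asha')")
# The audit row was inserted by the trigger, not by application code.
audit = con.execute("SELECT empID FROM audit_table").fetchall()
print(audit)  # [(7,)]
```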


Q1. Draw the diagram of system structure of DBMS. Write down the main
functions of each component. Discuss its types.

Diagram: System Structure of DBMS


+----------------------+
| Users / Applications |
+----------+-----------+
           |
           v
+----------------------+
|   Query Processor    |
|  - Parser            |
|  - Translator        |
|  - Optimizer         |
+----------+-----------+
           |
           v
+----------------------+
|   Execution Engine   |
+----------+-----------+
           |
           v
+----------------------+
|   Storage Manager    |
|  - Transaction Mgr   |
|  - Buffer Mgr        |
|  - File Mgr          |
|  - Recovery Mgr      |
|  - Authorization Mgr |
+----------+-----------+
           |
           v
+----------------------+
|     Physical DB      |
+----------------------+
Main Components and Functions:

1. Query Processor:
o Translates high-level SQL queries into a form the system can
execute.

o Parser: Checks syntax and builds query tree.


o Translator: Converts parsed query into relational algebra.
o Optimizer: Chooses the most efficient query execution plan.

2. Execution Engine:
o Executes the optimized query plan and communicates with the
storage manager.
3. Storage Manager:

o Manages interaction with the physical database.


o Transaction Manager: Ensures ACID properties.
o Buffer Manager: Manages cache memory.
o File Manager: Handles allocation of space and data structure.
o Recovery Manager: Ensures DB recovery after failure.

o Authorization Manager: Controls access and permissions.

4. Physical Database:
o Actual storage where data is saved on disk.
Types of DBMS:
1. Hierarchical DBMS: Data is organized in a tree-like structure. Example:
IBM IMS.
2. Network DBMS: Data is organized in graph structure, supports many-
to-many relationships.
3. Relational DBMS: Uses tables with rows and columns. Example:
MySQL, Oracle.
4. Object-Oriented DBMS: Supports object storage, inheritance. Example:
db4o, ObjectDB.


Q2. Definitions with Examples


a) Entity:
An entity is a real-world object or concept that can be distinctly identified in a
database. It has attributes describing its properties.
Example: A Student entity with attributes Roll_No, Name, and Age.

b) Composite Attribute and Multi-value Attribute:


 Composite Attribute: Can be divided into smaller sub-parts which
represent more basic attributes.
Example: Name can be divided into First_Name, Middle_Name,
Last_Name.
 Multi-value Attribute: Can hold multiple values for a single entity.
Example: Phone_Numbers for a person can include home, office, and
mobile numbers.
c) Binary vs Ternary Relationship:
 Binary Relationship: Involves two entities.
Example: Student enrolls in Course (Student-Course).
 Ternary Relationship: Involves three entities simultaneously.
Example: Doctor prescribes Medicine to Patient (Doctor-Patient-
Medicine).
d) Super Key, Candidate Key, Primary Key, Foreign Key:
 Super Key: Any attribute or set of attributes that uniquely identify a
tuple.
Example: {Roll_No, Name} is a super key.
 Candidate Key: Minimal super key without redundant attributes.
Example: {Roll_No} is a candidate key.
 Primary Key: A candidate key selected by the database designer to
uniquely identify tuples.
 Foreign Key: An attribute in one table that refers to the primary key in
another table, establishing a relationship.
Example: In the Enrollment table, Student_ID is a foreign key
referencing the Student table.

e) Aggregation, Specialization, Generalization:


 Aggregation: Treats a relationship set as an abstract entity to form
higher-level relationships.
Example: Representing a “works-on” relationship between Employee
and Project as an entity to relate with Department.
 Specialization: Process of creating sub-entities from a higher-level
entity based on some distinguishing characteristics.
Example: Employee specialized into Manager and Engineer.
 Generalization: Reverse of specialization, combining multiple entities
into a generalized higher-level entity.
Example: Combining Manager and Engineer into Employee.


Q3. Differences Between File System and Relational Database with Examples
File System:
A file system stores data in files or folders on disk. It is simple and widely
used but lacks sophisticated data management features. Files can be text,
binary, or structured but without enforced relationships or constraints.
 Data Storage: Data stored in flat files like CSV, text files, or
spreadsheets.
 Data Redundancy: High chance of duplicate data since no centralized
control.
 Data Access: Sequential or random file access using file pointers.

 Data Integrity: No built-in integrity constraints or validation.


 Multi-user Access: Limited support, often prone to inconsistencies with
concurrent access.
 Example: A company stores employee records in Excel sheets; payroll
data in another file.
Relational Database System:
A relational database stores data in tables with rows and columns and
supports powerful querying using SQL.

 Data Storage: Data is organized in tables (relations) with schema.


 Data Redundancy: Minimal due to normalization.

 Data Access: Supports complex queries via SQL.


 Data Integrity: Enforced through keys, constraints, and transactions.
 Multi-user Access: Supports multiple users with concurrency control.
 Example: Employee details stored in an Oracle database with tables like
Employee, Department, and Salary connected by keys.

Feature           | File System                      | Relational Database
Structure         | Unstructured or semi-structured  | Structured (tables with schema)
Query Language    | None (manual or programmatic)    | SQL
Data Redundancy   | High                             | Low (through normalization)
Data Integrity    | No                               | Enforced by constraints
Concurrent Access | Limited                          | Supported with locking
Recovery          | Manual                           | Automatic recovery mechanisms

In summary, file systems are simple but limited; relational DBMS provide
structured, reliable, and efficient data management suitable for complex
applications.

Q4. Explain Embedded SQL and its need. How is it different from Dynamic
SQL?
Embedded SQL is a technique where SQL statements are embedded directly
into a host programming language like C, C++, or Java. It allows programs to
execute SQL commands seamlessly with procedural code. The embedded SQL
statements are pre-compiled by a preprocessor that translates SQL into calls to
the database management system.
Need for Embedded SQL:
 To combine the power of SQL’s data manipulation capabilities with the
programming logic of a host language.
 To allow SQL statements to be written inside application programs,
enabling interaction with databases.
 Provides compile-time syntax checking of SQL statements for error
detection early in development.

Difference between Embedded SQL and Dynamic SQL:

Feature                | Embedded SQL                                              | Dynamic SQL
SQL Statement Handling | Static: statements are fixed and known at compile time    | Dynamic: statements are constructed and executed at runtime
Preprocessing          | Requires a preprocessor to convert embedded SQL           | No preprocessor needed; uses APIs like EXECUTE IMMEDIATE
Flexibility            | Less flexible for variable SQL queries                    | More flexible for constructing complex queries at runtime
Error Checking         | Syntax errors detected at compile time                    | Errors detected only at runtime
Use Case               | Applications with fixed queries                           | Applications requiring user input or dynamic queries

In summary, embedded SQL is suitable when SQL queries are known


beforehand and require compile-time checks, while dynamic SQL is used when
queries need to be built or modified at runtime for flexibility.

Q5. What is a trigger? What is the difference between a trigger and


procedure?
A trigger is a special kind of stored procedure that automatically executes (or
"fires") in response to certain events on a particular table or view, such as
INSERT, UPDATE, or DELETE. Triggers are used for enforcing complex business
rules, auditing changes, or maintaining integrity constraints.

Key Features of Triggers:


 Automatically invoked by the DBMS.
 Defined to respond to data modification events.
 Can execute before or after the triggering event.

 Cannot be called explicitly by user applications.


Difference between Trigger and Procedure:

Aspect       | Trigger                                 | Procedure
Invocation   | Automatically by DBMS on data events    | Manually called by user or application
Purpose      | Enforce integrity, audit, enforce rules | Perform specific operations or calculations
Timing       | Before or after data modification       | Executes when called
Parameters   | No explicit parameters                  | Can have input/output parameters
Control Flow | Limited, usually fixed sequence         | Full control flow with variables, loops

Triggers enhance database automation and integrity by reacting to data


changes without manual intervention, while procedures are reusable programs
executed explicitly.

Q6. Give formal definitions and explain with example with respect to:
a) Division Operation:
The division operator ÷ in relational algebra returns tuples from one relation
that are related to all tuples of another relation. For example, find students
who have taken all courses offered.
If R(A, B) and S(B) are relations, then R ÷ S returns all A values from R related to
all B in S.

b) Set Operations:
 Union (∪): Combines tuples from both relations, removing duplicates.
 Intersection (∩): Tuples common in both relations.

 Difference (-): Tuples in one relation but not the other.


c) Selection (σ):
Selects tuples from a relation that satisfy a predicate.
Example: σ_age>20(Student) — students older than 20.
d) Projection (π):
Returns only specified attributes from tuples, removing duplicates.
Example: π_name,age(Student) — list of student names and ages.
e) Natural Join (⨝):
Combines tuples from two relations based on common attribute values.
Example: Student ⨝ Enrollment (join on StudentID).
f) Outer Join:
Extends natural join by including unmatched tuples with NULL values. Types:
Left, Right, Full Outer Join.
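The core operators can be sketched over relations represented as lists of dicts (a teaching sketch only; this is not how a real DBMS evaluates relational algebra):

```python
student = [{"sid": 1, "name": "Asha", "age": 22},
           {"sid": 2, "name": "Ben",  "age": 19}]
enroll  = [{"sid": 1, "course": "DBMS"}]

def select(rel, pred):                      # sigma: filter tuples
    return [t for t in rel if pred(t)]

def project(rel, attrs):                    # pi: keep attrs, drop duplicates
    seen, out = set(), []
    for t in rel:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

def natural_join(r, s):                     # join on shared attribute names
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**t, **u} for t in r for u in s
            if all(t[a] == u[a] for a in common)]

print(select(student, lambda t: t["age"] > 20))   # Asha only
print(project(student, ["name"]))                 # names, no duplicates
print(natural_join(student, enroll))              # Asha joined with DBMS
```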

Q7. Define normalization. Explain 1NF, 2NF, 3NF and BCNF with example.
Normalization is the process of organizing data in a database to reduce
redundancy and improve data integrity by dividing tables and defining
relationships.
 1NF (First Normal Form):
Each attribute contains atomic values; no repeating groups.
Example: Table with multi-valued phone numbers violates 1NF; split into
separate rows.
 2NF (Second Normal Form):
In 1NF and every non-key attribute is fully functionally dependent on the
primary key (no partial dependency).
Example: If a table with composite key (A, B) has attribute depending
only on A, it violates 2NF.
 3NF (Third Normal Form):
In 2NF and no transitive dependency exists, i.e., non-key attributes
depend only on the key.
Example: If attribute C depends on B, which depends on key A, it violates
3NF.
 BCNF (Boyce-Codd Normal Form):
Stronger than 3NF; every determinant is a candidate key.
Handles anomalies not covered by 3NF.

Q8. Define functional dependencies and explain with example:


i) Armstrong’s Axioms:
Set of inference rules to derive all functional dependencies:
 Reflexivity: If Y ⊆ X, then X → Y

 Augmentation: If X → Y, then XZ → YZ
 Transitivity: If X → Y and Y → Z, then X → Z
ii) Closure of a set of functional dependencies:
All functional dependencies that can be inferred from a given set using
Armstrong’s axioms.

iii) Insert, Update, Delete anomalies:


 Insert anomaly: Cannot insert data without other data being present.

 Update anomaly: Changing data in multiple places leads to inconsistency.


 Delete anomaly: Deleting data unintentionally removes other necessary
data.

Q9. Decompose the schema R= (A, B, C, D, E) into (A, B, C) and (A, D, E). Also
show that this decomposition is lossless-join decomposition if the following
FDs hold: A→BC, CD→E, B→D, E→A.

Given:
 R = (A, B, C, D, E)
 Decomposition: R1 = (A, B, C), R2 = (A, D, E)
 F = {A→BC, CD→E, B→D, E→A}

To check lossless join:
The intersection R1 ∩ R2 = {A}.

If either R1 ∩ R2 → R1 or R1 ∩ R2 → R2 holds, the decomposition is lossless.

Compute the closure of A under F:
 A → BC gives A⁺ ⊇ {A, B, C}
 B → D then adds D, so A⁺ ⊇ {A, B, C, D}
 CD → E then adds E, so A⁺ = {A, B, C, D, E}

Since A⁺ contains every attribute of R, both A → R1 and A → R2 hold (A is in
fact a key of R).

Since A → R1 and A → R2, the decomposition is lossless.
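As a sanity check, the attribute-closure computation above can be expressed as a small fixed-point loop. A minimal sketch (the `closure` helper and the tuple encoding of F are illustrative, not part of the question):

```python
def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is in the closure, the right side joins it.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F = {A->BC, CD->E, B->D, E->A} from Q9
F = [({'A'}, {'B', 'C'}),
     ({'C', 'D'}, {'E'}),
     ({'B'}, {'D'}),
     ({'E'}, {'A'})]

print(sorted(closure({'A'}, F)))  # ['A', 'B', 'C', 'D', 'E']: A is a key of R
```

The loop repeats until no FD can add anything new, which is exactly how A⁺ is computed by hand.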

Q10. Given schema S = {A, B, C, D, E} with functional dependencies F = {A→B,
BC→E, ED→A}

(a) Is S in BCNF? Why?
A relation is in Boyce-Codd Normal Form (BCNF) if for every functional
dependency X → Y, X is a superkey (i.e., X⁺ includes all attributes of S).
Check each FD:
 A → B: A⁺ = {A, B}. It does not contain all attributes (C, D, E missing),
so A is not a superkey.
 BC → E: BC⁺ = {B, C, E}. Missing A and D, so BC is not a superkey.
 ED → A: ED⁺ contains E, D, and A (from ED→A); then A→B adds B. So
ED⁺ = {A, B, D, E}, missing C. ED is not a superkey.
Since no FD has a superkey on its left-hand side, S is not in BCNF.
(b) Is S in 3NF? Why?
3NF requires that for every FD X → Y, either X is a superkey or every
attribute of Y is prime (belongs to some candidate key).
The candidate keys of S are {A, C, D}, {B, C, D}, and {C, D, E} (see part d),
so A, B, and E are all prime attributes. Every right-hand side in F is
therefore prime, and S is in 3NF.

(c) Canonical Cover Fc of F

The canonical cover minimizes F by removing extraneous attributes and merging
FDs with the same left-hand side.
 In BC → E, neither B nor C is extraneous, since removing either breaks the
dependency.
 No FD is redundant; hence Fc = {A→B, BC→E, ED→A}.

(d) Candidate Keys for S

Find minimal attribute sets whose closure covers all attributes. C and D
appear on no right-hand side of F, so every key must contain both C and D.
CD⁺ = {C, D}, so CD alone is not a key; try adding one more attribute:
 ACD⁺ = {A, C, D, B, E} (A→B, then BC→E): all attributes.
 BCD⁺ = {B, C, D, E, A} (BC→E, then ED→A): all attributes.
 CDE⁺ = {C, D, E, A, B} (ED→A, then A→B): all attributes.

The candidate keys are {A, C, D}, {B, C, D}, and {C, D, E}.

Since every right-hand-side attribute in F (B, E, A) appears in some candidate
key, all are prime, so S is in 3NF but not BCNF.
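For small schemas like this one, candidate keys can also be found by brute force: enumerate attribute subsets by size and keep the minimal ones whose closure equals the schema. A sketch (exponential in the number of attributes, so for illustration only):

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """A candidate key is a minimal attribute set whose closure is the schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # proper superset of a key already found: not minimal
            if closure(s, fds) == schema:
                keys.append(s)
    return keys

S = {'A', 'B', 'C', 'D', 'E'}
F = [({'A'}, {'B'}), ({'B', 'C'}, {'E'}), ({'E', 'D'}, {'A'})]
print(sorted(''.join(sorted(k)) for k in candidate_keys(S, F)))  # ['ACD', 'BCD', 'CDE']
```

Running it on the Q10 schema confirms the three candidate keys derived by hand.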

Q11. Closed Hashing vs Open Hashing in Database Applications

Closed Hashing (Open Addressing):
All records are stored inside the hash table array itself. On collision, the
system probes for the next empty slot using methods such as linear probing,
quadratic probing, or double hashing.
 Advantages: Simple structure, no pointers needed, good cache performance.
 Disadvantages: Performance drops as the load factor approaches 1 due to
clustering and longer probe sequences. Deletion is complex.

Open Hashing (Separate Chaining):
Each slot in the hash table contains a linked list (or bucket) of entries that
hash to the same index. Collisions are resolved by inserting elements into
these chains.
 Advantages: Handles load factors > 1 gracefully; simple insertions and
deletions.
 Disadvantages: Extra memory overhead for pointers, and possibly slower
lookups due to pointer chasing.

Relative Merits in Databases:
 Open hashing is preferred when the dataset size is dynamic and can grow
unpredictably, since it avoids clustering and performs well at high load
factors.
 Closed hashing suits fixed-size datasets where memory is tight and the
load factor is kept low to maintain performance.
 Open hashing also supports concurrent access better, with less
clustering, making it suitable for multi-user database environments.
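The two collision-resolution strategies can be sketched as toy classes. This is a minimal illustration only; real hash indexes add resizing, deletion handling (tombstones for open addressing), and overflow management, all omitted here:

```python
class ChainedHash:
    """Open hashing (separate chaining): each slot holds a list of (key, value)."""
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))      # otherwise chain onto the bucket

    def get(self, key):
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return None

class ProbedHash:
    """Closed hashing (open addressing) with linear probing; assumes the table
    never fills completely and ignores deletion, which needs tombstones."""
    def __init__(self, size=16):
        self.slots = [None] * size

    def put(self, key, value):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)  # probe the next slot on collision
        self.slots[i] = (key, value)

    def get(self, key):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % len(self.slots)
        return None
```

Note how `ChainedHash` keeps working past a load factor of 1 (buckets simply grow), while `ProbedHash` depends on free slots remaining in the fixed array.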


Q12. Difference Between Clustering Index and Secondary Index


A Clustering Index is an index on a data file where the ordering of the data
records is the same as, or close to, the ordering of the index keys. There is
typically only one clustering index per table because the data can be physically
sorted on only one attribute (or set of attributes). The clustering index is built
on the primary key or another attribute that determines the physical order of
records on disk. This index is especially useful for range queries because related
data is stored contiguously, reducing disk I/O.
Example: If a table is physically sorted by EmployeeID, then an index on
EmployeeID is a clustering index.

A Secondary Index (also called a non-clustering index) is an index built on an
attribute that is not the ordering attribute of the data file. The data is not
stored in the order of the secondary index keys. Multiple secondary indexes
can exist on a table. The secondary index contains pointers to the actual
records in the data file. Secondary indexes speed up queries on columns that
are not keys.
Example: An index on EmployeeName when the table is sorted by EmployeeID
is a secondary index.

Summary:

Feature           | Clustering Index                 | Secondary Index
------------------|----------------------------------|-----------------------------
Data ordering     | Data physically ordered on key   | Data not physically ordered
Number per table  | One                              | Multiple
Suitable for      | Range queries, sequential access | Random access queries
Storage overhead  | Less (no extra pointers)         | More (pointers to records)

Q13. Serializability: Conflict and View Serializability


Serializability is the correctness criterion for concurrent transactions. It
ensures the concurrent execution of transactions yields the same results as
some serial (one-after-another) execution, maintaining database consistency.

Conflict Serializability is based on conflicting operations—read/write or


write/write on the same data item. Two operations conflict if they are from
different transactions and at least one is a write. A schedule is conflict-
serializable if it can be transformed into a serial schedule by swapping non-
conflicting operations.
Example:
Schedule S: T1 reads A, T2 writes A → conflict exists. If schedule can reorder
operations without changing conflicts to a serial order, it is conflict-serializable.

View Serializability considers the overall effect on data values and final
reads/writes. Two schedules are view-equivalent if:

1. They read the same initial values.


2. They produce the same final writes.

3. They read values written by the same transactions.


View serializability is more general and allows some schedules not conflict-
serializable to be correct, but it’s harder to check.

Q14. Lock and Concurrency Control Protocols


A Lock in DBMS is a mechanism to control concurrent access to data objects.
Locks ensure transactions do not interfere destructively.

Lock-Based Protocol:
Transactions acquire locks (shared for reading, exclusive for writing). Locks
serialize conflicting operations. Two-phase locking (2PL) acquires all locks
before releasing any, which guarantees serializability; strict 2PL
additionally holds exclusive locks until commit, avoiding cascading rollbacks.

Timestamp-Based Protocol:
Each transaction gets a unique timestamp. Operations are ordered by
timestamp. Transactions with older timestamps have priority; conflicting
operations may abort younger transactions. No locks are used.

Validation-Based Protocol:
Transactions execute without restrictions but are validated at commit time.
Validation checks if the transaction conflicts with others. If conflict detected,
transaction aborts. It avoids lock overhead but can lead to aborts.
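The lock-based protocol can be illustrated with a toy lock table. This sketch only grants or refuses requests; a real DBMS would queue waiting transactions and detect deadlocks:

```python
class LockManager:
    """A minimal shared/exclusive lock table, sketching lock-based control.

    Illustrative only: a real lock manager blocks waiters and handles
    deadlock, rather than simply refusing an incompatible request.
    """
    def __init__(self):
        self.locks = {}  # item -> {txn: 'S' or 'X'}

    def acquire(self, txn, item, mode):
        held = self.locks.setdefault(item, {})
        for other, other_mode in held.items():
            # S is compatible with S; X conflicts with everything.
            if other != txn and (mode == 'X' or other_mode == 'X'):
                return False  # caller must wait or abort
        held[txn] = mode  # grant (also upgrades S to X if txn is sole holder)
        return True

    def release_all(self, txn):
        # Under two-phase locking, all locks are released together at the end.
        for held in self.locks.values():
            held.pop(txn, None)

lm = LockManager()
assert lm.acquire('T1', 'A', 'S')      # T1 reads A
assert lm.acquire('T2', 'A', 'S')      # shared locks coexist
assert not lm.acquire('T2', 'A', 'X')  # write blocked while T1 holds S
lm.release_all('T1')
assert lm.acquire('T2', 'A', 'X')      # exclusive lock now granted
```

The compatibility check in `acquire` is exactly the S/X compatibility matrix: two readers may share an item, but a writer excludes everyone else.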

Q15. Log-Based Recovery Schemes


(a) Deferred Database Modification:
In this approach, database changes are not written until the transaction
commits. Logs record all updates. If a crash occurs before commit, the database
state remains unchanged. Recovery involves redo of committed transactions
only.

(b) Immediate Database Modification:


Changes are applied to the database immediately during transaction execution,
but logs keep track of old and new values. If a crash occurs, uncommitted
changes are undone using the log. Recovery requires undoing incomplete
transactions and redoing committed ones.
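Both schemes can be illustrated with a toy redo/undo pass over an in-memory log. This sketch assumes immediate modification, with log records carrying both old and new values; the log format and data are hypothetical:

```python
def recover(log, db):
    """Replay a write-ahead log after a crash (immediate-modification style).

    log entries: ('write', txn, item, old, new) or ('commit', txn).
    db: dict mapping item -> value, possibly left inconsistent by the crash.
    """
    committed = {entry[1] for entry in log if entry[0] == 'commit'}
    # Redo phase: reapply the writes of committed transactions, in log order.
    for entry in log:
        if entry[0] == 'write' and entry[1] in committed:
            _, _, item, old, new = entry
            db[item] = new
    # Undo phase: roll back writes of uncommitted transactions, newest first.
    for entry in reversed(log):
        if entry[0] == 'write' and entry[1] not in committed:
            _, _, item, old, new = entry
            db[item] = old
    return db

log = [('write', 'T1', 'A', 100, 50),   # T1 debits A
       ('write', 'T1', 'B', 200, 250),  # T1 credits B
       ('commit', 'T1'),
       ('write', 'T2', 'A', 50, 0)]     # T2 wrote A but never committed
db = {'A': 0, 'B': 200}                 # state left behind by the crash
print(recover(log, db))                 # {'A': 50, 'B': 250}
```

Under deferred modification, the undo phase would be unnecessary, since uncommitted writes never reach the database in the first place.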

Part-B
Q1. Relational Algebra Expression for Sum of Salaries of All Employees
Given two relations:
Employee(Name, salary, dept_no)
Department(dept_no, deptname, address)
Relational algebra is a procedural query language used to manipulate and
retrieve data from relations. However, classical relational algebra does not have
built-in aggregate functions like SUM. To compute the sum of salaries, an
extended relational algebra with aggregation is needed.
To find the total sum of salaries of all employees:
γ_SUM(salary)(Employee)

Where:
 γ is the grouping and aggregation operator
 SUM(salary) computes the sum of the salary attribute over all tuples in
Employee
If aggregation by department is needed, the expression becomes:
γ_dept_no; SUM(salary)(Employee)
This groups employees by their department number and computes total salary
per department.
Without aggregation, pure relational algebra cannot directly express sum
operations, but practical DBMSs support aggregate functions in SQL and
extended algebra.
This expression helps managers understand payroll expenses overall or by
department.
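The effect of γ can be mimicked in ordinary code: group tuples by the grouping attribute and fold the aggregate over each group. A sketch over hypothetical Employee tuples:

```python
from collections import defaultdict

# Hypothetical Employee(Name, salary, dept_no) tuples
employees = [('Asha', 50000, 10), ('Ravi', 60000, 10), ('Meena', 55000, 20)]

# γ_SUM(salary)(Employee): one total over the whole relation
total = sum(salary for _, salary, _ in employees)

# γ_dept_no; SUM(salary)(Employee): group by department, then sum each group
per_dept = defaultdict(int)
for _, salary, dept_no in employees:
    per_dept[dept_no] += salary

print(total)           # 165000
print(dict(per_dept))  # {10: 110000, 20: 55000}
```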

Q2. SQL Query to Count Employees in Each Department (Using dept_no)


SELECT dept_no, COUNT(*) AS employee_count
FROM Employee
GROUP BY dept_no;

Explanation:
 The GROUP BY dept_no clause groups employees by their department.
 COUNT(*) counts the number of employees in each group.
 The result shows each department number along with the count of employees
in it.
This query is useful for analyzing department sizes, resource allocation, and
workforce distribution in an organization.
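The query can be tried end-to-end with SQLite as a stand-in DBMS (the sample rows are hypothetical; an ORDER BY is added only to make the output order deterministic):

```python
import sqlite3

# Hypothetical sample data for the Employee(Name, salary, dept_no) schema
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE Employee (Name TEXT, salary INTEGER, dept_no INTEGER)')
con.executemany('INSERT INTO Employee VALUES (?, ?, ?)',
                [('Asha', 50000, 10), ('Ravi', 60000, 10), ('Meena', 55000, 20)])

# ORDER BY added only so the output order is deterministic
rows = con.execute('''SELECT dept_no, COUNT(*) AS employee_count
                      FROM Employee
                      GROUP BY dept_no
                      ORDER BY dept_no''').fetchall()
print(rows)  # [(10, 2), (20, 1)]
```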

Q3. SQL Query to Get Employee Name with Highest Salary in Each Department

SELECT E1.Name, E1.dept_no, E1.salary
FROM Employee E1
WHERE E1.salary = (
    SELECT MAX(E2.salary)
    FROM Employee E2
    WHERE E2.dept_no = E1.dept_no
);

Explanation:

 The inner (correlated) subquery finds the maximum salary in the current
employee's department.
 The outer query selects employees whose salary matches the maximum
salary of their department.
 This handles ties where multiple employees share the highest salary.
This query is commonly used for performance review or bonus allocation.
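The correlated subquery can be verified against hypothetical data that includes a salary tie, again using SQLite as a stand-in DBMS:

```python
import sqlite3

# Hypothetical data including a salary tie in department 20
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE Employee (Name TEXT, salary INTEGER, dept_no INTEGER)')
con.executemany('INSERT INTO Employee VALUES (?, ?, ?)',
                [('Asha', 50000, 10), ('Ravi', 60000, 10),
                 ('Meena', 55000, 20), ('Kiran', 55000, 20)])

rows = con.execute('''SELECT E1.Name, E1.dept_no, E1.salary
                      FROM Employee E1
                      WHERE E1.salary = (SELECT MAX(E2.salary)
                                         FROM Employee E2
                                         WHERE E2.dept_no = E1.dept_no)
                      ORDER BY E1.dept_no, E1.Name''').fetchall()
print(rows)  # [('Ravi', 10, 60000), ('Kiran', 20, 55000), ('Meena', 20, 55000)]
```

Both tied employees in department 20 are returned, confirming that the pattern handles ties.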

Q4. Partial Functional Dependency with Example


A partial functional dependency occurs in a relation where a non-prime
attribute depends on only part of a composite primary key, violating Second
Normal Form (2NF).
Example:
Consider relation R(A, B, C) with composite primary key (A, B). If attribute C
depends only on A (A → C), this is a partial dependency.
Problem: If B varies but A stays constant, C repeats unnecessarily, causing
redundancy and update anomalies.
Why it matters: Partial dependencies lead to insertion, deletion, and update
anomalies, degrading data integrity.
Normalization removes these dependencies by decomposing relations into
smaller ones where attributes depend fully on the key.

Q5. Concept of Serializability with Suitable Example


Serializability ensures that concurrent transaction execution results in a
database state equivalent to some serial (one-after-another) execution,
preserving consistency.
Example:
Transactions:

 T1 transfers money from A to B.

 T2 checks balance of A.
If T2 reads A’s balance before T1 commits, it may see inconsistent data.

Two types:
 Conflict Serializability: Based on conflicting operations (read-write,
write-read).
 View Serializability: Based on data read and final outputs.
Serializability is essential to avoid concurrency anomalies and ensure
correctness in multi-user environments.
Q6. Why Is a Transaction Required to Be Atomic? Explain Using Suitable
Example
Atomicity is a fundamental property of transactions in DBMS, meaning a
transaction is an indivisible unit of work that either completes entirely or does
not happen at all. This property ensures data integrity even in cases of system
failures, power outages, or errors during execution.
When a transaction is executed, it may consist of multiple operations like
reading, writing, updating data. Atomicity guarantees that if any operation
within the transaction fails, all the previously performed operations are undone
(rolled back) to maintain the database’s consistency.
Example: Consider a banking application where a transaction transfers $500
from Account A to Account B. The steps include:

1. Deduct $500 from Account A’s balance.

2. Add $500 to Account B’s balance.


If the system crashes after step 1 but before step 2, without atomicity, the $500
would be lost, causing inconsistency. Atomicity ensures that if the transaction
cannot complete both steps, it will rollback the deduction from Account A,
preserving the original balance and avoiding data corruption.
Atomicity is enforced using transaction logs and recovery mechanisms, which
record every step so that incomplete transactions can be undone automatically
during recovery.
In summary, atomicity protects the database from partial or incomplete
transactions, preserving correctness and reliability. Without atomicity,
databases would be vulnerable to inconsistent states, leading to data loss or
corruption, especially in multi-user, concurrent environments.
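The rollback behaviour can be demonstrated with SQLite, simulating a crash between the two steps with an exception (the schema and amounts are hypothetical):

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE Account (name TEXT PRIMARY KEY, balance INTEGER)')
con.executemany('INSERT INTO Account VALUES (?, ?)', [('A', 1000), ('B', 0)])
con.commit()

def transfer(amount):
    con.execute("UPDATE Account SET balance = balance - ? WHERE name = 'A'", (amount,))
    raise RuntimeError('simulated crash before step 2')  # the credit to B never runs
    con.execute("UPDATE Account SET balance = balance + ? WHERE name = 'B'", (amount,))

try:
    transfer(500)
    con.commit()
except RuntimeError:
    con.rollback()  # atomicity: the partial debit on A is undone

print(con.execute('SELECT name, balance FROM Account ORDER BY name').fetchall())
# [('A', 1000), ('B', 0)]: the database is unchanged, not half-updated
```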

Q7. What is the Purpose of Creating Views in DBMS?


A view is a virtual table in DBMS created by a query over one or more base
tables. It presents data from these tables in a specific way without physically
storing the data separately.
The purpose of views includes:
1. Simplifying Complex Queries: Views hide the complexity of SQL joins,
aggregations, or calculations by encapsulating them, making it easier for
users to retrieve data without writing complex queries.
2. Data Security and Access Control: Views can restrict access to sensitive
data by exposing only specific columns or rows of a base table. For
example, a salary column can be excluded from the view so that users
cannot see it.
3. Logical Data Independence: Views provide a level of abstraction, so
changes in base tables do not affect user applications as long as the view
definition remains unchanged.
4. Customized Presentation: Different users or applications may require
different views of the data; views can provide these customized data
presentations without duplicating data.
5. Data Aggregation and Summarization: Views can be used to present
summarized or aggregated data like total sales, average salary, etc.,
without physically storing these results.
Since views do not store data physically, they always reflect the current state of
underlying base tables, ensuring up-to-date information without redundancy.
Overall, views enhance data abstraction, security, and ease of use in a DBMS
environment.
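A small SQLite session illustrates two of these points: the view hides the salary column, and because it stores no data it immediately reflects changes to the base table (the names used are hypothetical):

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE Employee (name TEXT, salary INTEGER, dept_no INTEGER)')
con.execute("INSERT INTO Employee VALUES ('Asha', 50000, 10)")

# A view exposing only non-sensitive columns (salary hidden)
con.execute('CREATE VIEW public_emp AS SELECT name, dept_no FROM Employee')

print(con.execute('SELECT * FROM public_emp').fetchall())  # [('Asha', 10)]

# Views store no data: changes to the base table show up immediately
con.execute("INSERT INTO Employee VALUES ('Ravi', 60000, 20)")
print(con.execute('SELECT * FROM public_emp ORDER BY name').fetchall())
# [('Asha', 10), ('Ravi', 20)]
```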

Q8. Explain DBMS Architecture and Its Types


DBMS architecture describes how data is stored, accessed, and managed in a
database system. The most common architecture is the Three-Level
Architecture, which supports data abstraction and independence:
1. Internal Level: The lowest level, it describes how data is physically stored
in the database, including file structures, indexes, and storage methods.
It deals with efficiency and optimization but is hidden from users.
2. Conceptual Level: The middle level defines the logical structure of the
entire database, including all entities, attributes, and relationships. It
hides the physical details and focuses on what data is stored and its
constraints.
3. External Level: The highest level, it consists of multiple user views
customized for different user needs. Each user or application sees only a
subset of the database relevant to them.
This layered architecture ensures data independence, meaning changes in
physical storage or conceptual design do not affect user views.
Types of DBMS Architectures:
 Centralized Architecture: All data and DBMS software reside at a single
site. Suitable for small systems but limited in scalability and reliability.
 Client-Server Architecture: The DBMS runs on a server; clients interact
with it remotely. This supports multiple users and better performance.
 Distributed Architecture: Data is distributed across multiple sites
connected by a network. This improves reliability and allows parallel
processing but introduces complexity.
 Parallel Architecture: Multiple processors access a shared database to
enhance performance and reliability, used in large-scale systems.
Each architecture is chosen based on system needs such as scalability, fault
tolerance, response time, and cost.

Q9. Explain Mapping Cardinalities Between Two Entities with Real-Life Example

Mapping cardinality in an ER model defines the number of instances of one
entity that can or must be associated with instances of another entity. It
determines the nature of relationships between entities.

There are three common types:


1. One-to-One (1:1): Each instance of entity A corresponds to exactly one
instance of entity B, and vice versa.
Example: Each person has one passport, and each passport is assigned to
one person.
2. One-to-Many (1:N): One instance of entity A is related to many instances
of entity B, but each instance of B is related to only one instance of A.
Example: A department has many employees, but each employee
belongs to only one department.
3. Many-to-Many (M:N): Instances of entity A can relate to multiple
instances of entity B, and vice versa.
Example: Students enroll in many courses, and each course has many
students.
Mapping cardinalities affect database design, helping determine how foreign
keys are assigned and how relationships are implemented (via join tables for
many-to-many).
Understanding these helps to maintain referential integrity and accurately
model real-world constraints.

Q10. Explain Candidate Key, Super Key, Foreign Key, Alternate Key, Composite
Key, and Artificial Key with Examples
 Candidate Key: A minimal set of attributes that uniquely identify tuples
in a relation. There can be multiple candidate keys.
Example: In Employee(Name, EmpID, Email), EmpID and Email can both
be candidate keys.
 Super Key: Any set of attributes that uniquely identifies a tuple, not
necessarily minimal.
Example: (EmpID, Name) is a super key because EmpID alone is a
candidate key.
 Primary Key: The candidate key chosen by the database designer to
uniquely identify tuples.
Example: EmpID chosen as the primary key.
 Foreign Key: An attribute (or set) in one relation that refers to the
primary key in another relation, used to maintain referential integrity.
Example: Dept_no in Employee references Dept_no in Department.
 Alternate Key: Candidate keys not selected as primary key.
Example: Email, if not chosen as primary key, is an alternate key.
 Composite Key: A key composed of two or more attributes to uniquely
identify a tuple.
Example: In Enrollment(StudentID, CourseID), both together uniquely
identify a record.
 Artificial (Surrogate) Key: A system-generated key with no real-world
meaning, used for simplicity or performance.
Example: Auto-incremented ID numbers.
Each key type serves specific roles in ensuring uniqueness, establishing
relationships, and maintaining integrity within the database.

Part-C

Q1. What is Normalization? Explain All Types of Normal Forms with Example
Normalization is a fundamental database design process that organizes data to
reduce redundancy and dependency by dividing large tables into smaller
related tables. It improves data integrity and minimizes anomalies during data
operations like insert, update, or delete.

Types of Normal Forms:


1. First Normal Form (1NF):
A relation is in 1NF if all its attributes contain atomic values (no repeating
groups or arrays). For example, a table storing student courses should list
each course in a separate row rather than multiple courses in one field.
2. Second Normal Form (2NF):
A relation is in 2NF if it is in 1NF and every non-key attribute is fully
functionally dependent on the entire primary key, not just a part of it.
For example, if a table has a composite key (StudentID, CourseID),
attributes like Grade should depend on both keys, not just StudentID.
3. Third Normal Form (3NF):
A relation is in 3NF if it is in 2NF and all the non-key attributes are not
transitively dependent on the primary key. For example, if StudentID →
DeptID and DeptID → DeptName, then DeptName transitively depends
on StudentID and should be moved to a separate table to remove
redundancy.
4. Boyce-Codd Normal Form (BCNF):
A stronger version of 3NF, BCNF requires that every determinant is a
candidate key. It addresses anomalies that may still exist in 3NF. For
example, in a relation where an attribute functionally determines part of
a candidate key, BCNF decomposition is necessary to eliminate
redundancy.
Normalization is essential for designing a well-structured database that
maintains consistency, reduces duplication, and supports efficient queries and
updates. However, excessive normalization can lead to complex queries, so
practical designs balance normalization with performance.

Q2. What is Relational Decomposition? Explain Types of Decomposition with Examples

Relational decomposition is the process of splitting a relation into two or more
relations to improve database design by removing redundancy and anomalies.
The main goals are to achieve lossless join and dependency preservation.

Types of Decomposition:
1. Lossless Decomposition:
A decomposition is lossless if the original relation can be perfectly
reconstructed by joining the decomposed relations. This prevents loss of
information.
Example:
Relation R(A, B, C) decomposed into R1(A, B) and R2(A, C) is lossless if
attribute A is a key. The join of R1 and R2 on A returns the original
relation without data loss.
2. Dependency Preserving Decomposition:
This ensures that all functional dependencies in the original relation can
be enforced in the decomposed relations without needing to join them.
This makes enforcement of constraints efficient.
Example:
If the original set of dependencies F can be split so that each FD is in one
of the decomposed relations, then the decomposition preserves
dependencies.
Decomposition helps eliminate redundancy, avoid anomalies, and make
databases easier to maintain. However, sometimes trade-offs exist between
lossless join and dependency preservation, so database designers must balance
these factors.

Q3. (a) State Diagram of Transaction with Explanation


A transaction’s lifecycle in a DBMS is represented by a state diagram showing
the possible states and transitions:
 Active: The transaction is executing its operations.
 Partially Committed: The transaction has finished all operations but
changes are not yet permanent.
 Committed: The transaction has successfully completed, and all changes
are permanently saved to the database.

 Failed: The transaction encounters an error, halting its progress.


 Aborted: The transaction is rolled back; any changes made are undone
to maintain database consistency.
This model guarantees atomicity and durability properties of transactions,
ensuring the database remains consistent even in the event of failure.
(b) Schedule and Types
A schedule is an interleaving of operations from multiple transactions,
preserving individual transaction order but mixing operations to allow
concurrency.
 Serial Schedule: Transactions execute one after the other, no
interleaving.
 Concurrent Schedule: Operations from multiple transactions interleave,
allowing better resource use.
 Conflict Serializability: A schedule is conflict serializable if it can be
transformed into a serial schedule by swapping non-conflicting
operations.
 View Serializability: A more general condition ensuring the schedule
produces the same final data and reads as a serial schedule.

Q4. (a) Difference Between Irrecoverable and Recoverable Schedules with Cascading Rollback Example

 Recoverable Schedule: A schedule where a transaction commits only
after all transactions whose data it read have committed. This avoids
inconsistency from committing on uncommitted data.
 Irrecoverable Schedule: One where a transaction commits even though
it read uncommitted data from another transaction that might abort,
leading to inconsistencies.
Cascading Rollback happens when aborting one transaction forces other
dependent transactions to abort as well.
Example: If T2 reads data written by T1, and T1 aborts, then T2 must also abort
to maintain consistency, causing a cascade of rollbacks.
(b) Difference Between ODBC and JDBC
 ODBC (Open Database Connectivity): A language-independent API for
accessing database management systems, often used in C/C++ programs
and supports many databases via drivers.
 JDBC (Java Database Connectivity): A Java-specific API designed for Java
applications to interact with databases. It supports dynamic SQL,
prepared statements, and is platform-independent within Java
environments.

Q5. (a) Conflict vs View Serializability


 Conflict Serializability: Ensures that the schedule can be rearranged into
a serial order by swapping non-conflicting operations. It depends on
direct conflicts between read/write operations.
 View Serializability: A broader concept, requiring the schedule to
produce the same final results and reads as a serial schedule, regardless
of conflicts.
Conflict serializability is easier to check and is a subset of view serializability.

(b) Embedded SQL vs Dynamic SQL


 Embedded SQL: SQL commands are hard-coded in the host program (like
C or Java), compiled and executed as part of the application. Best for
fixed queries.
 Dynamic SQL: SQL commands are constructed and executed at runtime,
allowing applications to build flexible queries based on user input or
other runtime conditions.

Dynamic SQL provides more flexibility but can be more complex to manage.

Part-B questions:

Q1. (a) Explain any four Codd’s Rules/Laws of RDBMS.


Edgar F. Codd proposed 12 rules defining a true relational database system.
Here are four key rules:
1. Information Rule:
All data in the database must be represented in tables (relations) with
rows and columns, ensuring uniformity and logical representation.
2. Guaranteed Access Rule:
Every individual data item should be accessible by specifying the table
name, primary key, and column name. This provides straightforward,
direct access to data without ambiguity.
3. Null Values Rule:
The system must support null values to represent missing or unknown
information distinctly from zero or empty strings, allowing for more
realistic modeling of incomplete data.
4. Dynamic Online Catalog:
The database catalog (metadata) must be stored as tables and
accessible via the same query language used for data retrieval and
manipulation, enabling self-describing databases.
These rules help ensure data integrity, ease of access, and consistency in
relational database systems.

(b) Explain GRANT and REVOKE command with syntax and example.
 GRANT: Used to give privileges to users on database objects such as
tables, views, etc. It controls who can perform operations like SELECT,
INSERT, UPDATE, DELETE.
Syntax:
GRANT privilege ON object TO user;
Example:
GRANT SELECT, INSERT ON Employee TO user1;
This allows user1 to read and insert records into the Employee table.
 REVOKE: Used to withdraw previously granted privileges.
Syntax:
REVOKE privilege ON object FROM user;
Example:
REVOKE INSERT ON Employee FROM user1;
This removes insert permission from user1 on the Employee table.
Together, these commands enforce security by controlling user permissions in
the database.

Q2. (a) Explain aggregate functions with syntax and example.


Aggregate functions perform calculations on sets of rows, returning a single
summarized value. Common aggregate functions include:
 SUM(column): Adds all numeric values in a column.
 COUNT(column): Counts the number of non-null values.

 AVG(column): Computes the average of numeric values.

 MAX(column): Finds the maximum value.


 MIN(column): Finds the minimum value.
Syntax:
SELECT aggregate_function(column) FROM table WHERE condition;

Example:

SELECT SUM(salary) FROM Employee WHERE deptno = 10;


This query sums all salaries of employees working in department 10.
Aggregate functions are useful for generating reports and summaries from
large datasets.

(b) Write and explain syntax for creating and dropping synonyms with
example.
 CREATE SYNONYM: Creates an alias for a database object, allowing
simpler or alternative naming.
Syntax:
CREATE SYNONYM synonym_name FOR object_name;
Example:
CREATE SYNONYM emp_syn FOR Employee;
Here, emp_syn is a synonym for the Employee table.
 DROP SYNONYM: Removes the synonym from the database.
Syntax:
DROP SYNONYM synonym_name;
Example:
DROP SYNONYM emp_syn;
Synonyms simplify SQL commands by providing short or more meaningful
object references, especially useful in large or complex databases.

Q3. (a) ER Diagram for Hospital Management System


Entities and their attributes:

 DOCTOR: Doctor_ID (PK), Name, Specialty

 PATIENT: Patient_ID (PK), Name, DOB

 HOSPITAL: Hospital_ID (PK), Name, Address


 MEDICAL_RECORD: Record_ID (PK), Patient_ID (FK), Doctor_ID (FK),
Diagnosis, Treatment
Relationships:
 Doctor works at Hospital (Many Doctors to One Hospital)

 Patient admitted to Hospital (Many Patients to One Hospital)


 Medical_Record links Patient and Doctor (Each record references one
Patient and one Doctor)
Primary Keys uniquely identify each entity. Foreign Keys (Patient_ID and
Doctor_ID in Medical_Record) reference primary keys, establishing
relationships and ensuring referential integrity.
(b) SQL Queries for EMP table:

i) Employee names and numbers ordered by salary ascending:


SELECT ename, empno FROM EMP ORDER BY salary ASC;

ii) Employee name and number grouped by department:

SELECT deptno, ename, empno FROM EMP ORDER BY deptno;


iii) Total salary of all employees:
SELECT SUM(salary) FROM EMP;

iv) Number of employees department-wise:


SELECT deptno, COUNT(*) FROM EMP GROUP BY deptno;

v) Employees with experience more than 3 years (based on joining date):


SELECT ename FROM EMP WHERE joiningdate <= ADD_MONTHS(SYSDATE, -36);
vi) Employee names starting with 'S' working in deptno 1002:

SELECT ename FROM EMP WHERE ename LIKE 'S%' AND deptno = 1002;

Q4. (a) Explain ALTER command with two options.


The ALTER command modifies the structure of existing database objects such
as tables.
Two common uses:
1. Adding a column:
ALTER TABLE EMP ADD (email VARCHAR2(50));
Adds an email column to the EMP table.

2. Modifying a column’s data type:

ALTER TABLE EMP MODIFY (salary NUMBER(10,2));


Changes the salary column’s datatype and precision.
This command is essential for evolving database schemas without dropping
tables.
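The ADD variant can be tried in SQLite (note that MODIFY, as shown above, is Oracle syntax; SQLite supports only a restricted set of ALTER TABLE operations, ADD COLUMN among them):

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE EMP (empno INTEGER PRIMARY KEY, ename TEXT)')

# SQLite supports the ADD COLUMN form of ALTER TABLE
con.execute('ALTER TABLE EMP ADD COLUMN email TEXT')

cols = [row[1] for row in con.execute('PRAGMA table_info(EMP)')]
print(cols)  # ['empno', 'ename', 'email']
```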

(b) Explain multivalued dependencies with example.


A multivalued dependency (MVD) exists when one attribute determines
multiple independent sets of attributes.
Example:
In relation STUDENT(Course, Hobby), if a student can have multiple courses
and multiple hobbies independently, the dependency Course →→ Hobby
holds. That is, for each course, multiple hobbies exist independently.
MVDs cause redundancy, leading to update anomalies, and are resolved by
decomposing tables into Fourth Normal Form (4NF).

Q5. (a) Define Normalization and state three advantages.


Normalization is the process of organizing data to minimize redundancy and
avoid anomalies by decomposing tables according to normal forms.

Advantages:
1. Eliminates Data Anomalies: Insert, update, and delete anomalies are
minimized.
2. Improves Data Integrity: Ensures consistency by removing redundant
data.
3. Efficient Storage: Reduces data duplication, saving storage space.

(b) Difference between Conflict and View Serializability. Testing for Serializability.

 Conflict Serializability: Two schedules are conflict serializable if they
can be transformed into a serial schedule by swapping non-conflicting
operations (read-write or write-write conflicts). It is easier to test.
 View Serializability: Two schedules are view serializable if they produce
the same read-from and final write results as some serial schedule. It is
more general but complex to verify.
Testing:
Construct a precedence (conflict) graph with transactions as nodes and edges
representing conflicts. If the graph is acyclic, the schedule is conflict
serializable.
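The precedence-graph test can be sketched directly: collect conflict edges, then look for a cycle. The schedule encoding below is illustrative:

```python
def conflict_serializable(schedule):
    """Precedence-graph test: build conflict edges, then check for a cycle.

    schedule: list of (txn, op, item) with op 'R' or 'W'.
    An edge Ti -> Tj exists when an operation of Ti precedes a conflicting
    operation of Tj (same item, different transactions, at least one write).
    """
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and 'W' in (op1, op2):
                edges.add((t1, t2))

    nodes = {t for t, _, _ in schedule}

    def has_cycle(node, path, seen):
        if node in path:
            return True       # back edge found: cycle
        if node in seen:
            return False      # already fully explored
        seen.add(node)
        path.add(node)
        for a, b in edges:
            if a == node and has_cycle(b, path, seen):
                return True
        path.discard(node)
        return False

    return not any(has_cycle(n, set(), set()) for n in nodes)

# Serial-looking schedule: only T1 -> T2 edges, acyclic
ok = [('T1', 'R', 'A'), ('T1', 'W', 'A'), ('T2', 'R', 'A'), ('T2', 'W', 'A')]
# Interleaving that creates both T1 -> T2 and T2 -> T1: a cycle
bad = [('T1', 'R', 'A'), ('T2', 'W', 'A'), ('T1', 'W', 'A')]
print(conflict_serializable(ok), conflict_serializable(bad))  # True False
```

An acyclic graph means the transactions can be topologically sorted into an equivalent serial order, which is exactly the criterion stated above.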

Q1. What are views? Explain how views are different from tables.
Views are virtual tables in a database that are defined by a SQL query. Unlike
regular tables, views do not store data physically; they dynamically display
data retrieved from one or more underlying base tables whenever accessed.
Views provide a way to present specific data subsets, filter rows, combine
columns from multiple tables, or hide complexity from users.
Key Characteristics of Views:

 They simplify complex queries by encapsulating them as virtual tables.


 They enhance security by restricting user access to sensitive data (e.g.,
showing only selected columns).
 They provide data abstraction and logical independence from the
physical storage.

Differences between Views and Tables:


1. Storage: Tables store data physically; views do not store data but fetch
data dynamically based on the view’s SELECT statement.
2. Updateability: Tables can be updated directly. Views may be updatable
only if they are simple enough (e.g., no aggregations or joins);
otherwise, they are read-only.
3. Security: Views can limit user access by exposing only specific rows or
columns, acting as a security mechanism.
4. Performance: Since views execute their underlying queries every time
they are accessed, complex views may cause slower query performance
compared to querying tables directly.
5. Definition: Tables are defined by their schema and hold the data; views
are defined by queries stored in the metadata.
In practice, views help in managing user roles, hiding complexity, and
simplifying application development. They allow multiple users to see
customized versions of the same data without replicating it. However, since
views don’t physically store data, changes in base tables immediately reflect
in views.
In summary, views are virtual tables for convenience, security, and
abstraction, while tables are the core storage units in databases.
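The claim that base-table changes immediately show through a view can be demonstrated with a short runnable sketch using Python's sqlite3 (the `employee` table and `high_earners` view names here are assumptions for illustration):

```python
import sqlite3

# A view stores no data: it re-runs its defining SELECT on each access,
# so changes to the base table are visible through it immediately.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
con.execute("INSERT INTO employee VALUES (1, 'Alice', 60000), (2, 'Bob', 40000)")
con.execute("CREATE VIEW high_earners AS "
            "SELECT name FROM employee WHERE salary > 50000")

print([r[0] for r in con.execute("SELECT name FROM high_earners")])  # ['Alice']

# Update the base table: the view reflects it without being redefined.
con.execute("UPDATE employee SET salary = 70000 WHERE name = 'Bob'")
print([r[0] for r in
       con.execute("SELECT name FROM high_earners ORDER BY name")])
# ['Alice', 'Bob']
```

This also illustrates the performance point above: every access to `high_earners` re-executes the underlying query against the base table.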

Q2. (a) What are the disadvantages of file processing systems?


File processing systems were early methods of data management where data
was stored in flat files without any sophisticated management system. They
have several limitations:
 Data Redundancy: The same data is often duplicated across multiple
files, wasting storage and complicating updates.
 Data Inconsistency: Due to redundancy, updates in one file may not
reflect in others, causing inconsistent data.
 Lack of Data Independence: Any change in file structure requires
modifying all application programs accessing the file.
 Difficulty in Accessing Data: Complex queries involving multiple files
are difficult and require specialized programming.
 Data Isolation: Data is scattered across files making it difficult to
retrieve related information.
 Concurrent Access Problems: File systems lack proper mechanisms for
concurrent data access, leading to potential conflicts or corruption.
 Poor Security: No fine-grained access control mechanisms exist.
 No Recovery Mechanisms: In case of failures, recovery is complicated
and manual.
Because of these disadvantages, file systems are largely replaced by Database
Management Systems (DBMS), which solve these issues through structured
data storage, query languages, concurrency control, and recovery features.

(b) What is data independence? Explain the difference between physical and
logical data independence.
Data Independence is the ability to change the schema at one level of a
database system without affecting the schema or applications at higher
levels. It is crucial for database evolution and maintenance.
 Physical Data Independence:
Refers to the capacity to change physical storage details (e.g., indexing
methods, file organization) without affecting the conceptual schema or
user applications. For example, switching from one storage device to
another or reorganizing files does not require changes in applications.
 Logical Data Independence:
Refers to the ability to change the conceptual schema without altering
external schemas or application programs. For example, adding new
fields or tables, changing relationships, or deleting attributes at the
conceptual level can be done without modifying applications.
Logical data independence is more difficult to achieve than physical because
applications often depend on the logical structure of data. Both forms of
independence improve database flexibility and reduce maintenance costs.

(c) What is a relationship and what are their different types?


In a database, a relationship defines how entities are associated with one
another. It links entity sets to represent real-world associations.

Types of Relationships:
1. One-to-One (1:1):
Each entity in set A is related to at most one entity in set B, and vice
versa.
Example: Each person has one unique passport.
2. One-to-Many (1:N):
An entity in set A can be associated with multiple entities in set B, but
each entity in B relates to only one entity in A.
Example: One department has many employees.
3. Many-to-Many (M:N):
Entities in set A can relate to multiple entities in set B and vice versa.
Example: Students enroll in many courses; courses have many
students.
4. Unary Relationship:
An entity relates to itself.
Example: An employee supervises other employees.
Relationships help structure data and define constraints, essential for
meaningful database design.

Q3. (a) How candidate key is different from super key?


A super key is any combination of one or more attributes that uniquely
identifies a tuple in a relation. It can contain extra attributes that are not
necessary for uniqueness.
A candidate key is a minimal super key, meaning no attribute can be removed
without losing the uniqueness property. Candidate keys are potential primary
keys.
Example:
Consider a STUDENT relation with attributes {StudentID, Name, Email}.
 {StudentID, Email} is a super key: together they uniquely identify a
student, but Email is a redundant extra attribute.
 {StudentID} alone is a candidate key, since it uniquely identifies a
student by itself.
 {StudentID} is minimal: it contains no attribute that could be dropped
while preserving uniqueness.
Thus, every candidate key is a super key, but not every super key is a
candidate key.

(b) Write short notes on:


(i) Data Manipulation Language (DML):
DML is a subset of SQL used to query and manipulate data in databases. It
includes commands like SELECT (to retrieve data), INSERT (to add new data),
UPDATE (to modify existing data), and DELETE (to remove data). DML enables
users to perform CRUD operations efficiently.
(ii) Derived Attribute:
A derived attribute is one whose value is calculated from other attributes, not
stored physically. For example, Age can be derived from the Date of Birth
(DOB). Derived attributes reduce redundancy and are computed as needed.
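The Age-from-DOB example can be computed in a query rather than stored. A small sketch using Python's sqlite3 (the `person` table is an assumed example, and the calculation compares years against a fixed reference date purely for illustration):

```python
import sqlite3

# Derived attribute sketch: Age is never stored; it is computed from the
# stored DOB each time the query runs (year difference only, simplified).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (name TEXT, dob TEXT)")
con.execute("INSERT INTO person VALUES ('Alice', '2000-06-15')")

row = con.execute(
    "SELECT name, "
    "CAST(strftime('%Y', '2025-01-01') AS INTEGER) - "
    "CAST(strftime('%Y', dob) AS INTEGER) AS age "
    "FROM person"
).fetchone()
print(row)  # ('Alice', 25)
```

Because Age is derived, it can never disagree with DOB, which is exactly the redundancy-avoidance argument made above.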

Q4. SQL Queries for Relations

Given:
 S(S#, SNAME, STATUS, CITY)
 SP(S#, P#, QTY)
 P(P#, PNAME, COLOR, WEIGHT, CITY)

(i) Get supplier names who supply at least one red part:

SELECT DISTINCT S.SNAME
FROM S, SP, P
WHERE S.S# = SP.S# AND SP.P# = P.P# AND P.COLOR = 'Red';

(ii) Get supplier names who do not supply part P2:

SELECT SNAME
FROM S
WHERE S# NOT IN (
    SELECT S#
    FROM SP
    WHERE P# = 'P2'
);
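Both queries can be checked against tiny sample data with Python's sqlite3. The rows below are invented for illustration; note that SQLite requires the `#` in the identifiers to be double-quoted:

```python
import sqlite3

# Run the two supplier queries on assumed sample rows.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE S ("S#" TEXT, SNAME TEXT, STATUS INT, CITY TEXT);
CREATE TABLE P ("P#" TEXT, PNAME TEXT, COLOR TEXT, WEIGHT INT, CITY TEXT);
CREATE TABLE SP ("S#" TEXT, "P#" TEXT, QTY INT);
INSERT INTO S VALUES ('S1','Smith',20,'London'), ('S2','Jones',10,'Paris');
INSERT INTO P VALUES ('P1','Nut','Red',12,'London'), ('P2','Bolt','Green',17,'Paris');
INSERT INTO SP VALUES ('S1','P1',300), ('S1','P2',200), ('S2','P1',300);
""")

red = con.execute("""
    SELECT DISTINCT S.SNAME FROM S, SP, P
    WHERE S."S#" = SP."S#" AND SP."P#" = P."P#" AND P.COLOR = 'Red'
""").fetchall()
# Both S1 (Smith) and S2 (Jones) supply the red part P1.

no_p2 = con.execute("""
    SELECT SNAME FROM S
    WHERE "S#" NOT IN (SELECT "S#" FROM SP WHERE "P#" = 'P2')
""").fetchall()
# Only S2 (Jones) never supplies P2.
```

The NOT IN subquery in (ii) matters: a plain join with `P# <> 'P2'` would wrongly return Smith, who supplies P2 alongside other parts.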

Q5. Discuss two-phase locking protocol. Variations with merits and demerits.
Two-Phase Locking (2PL) is a concurrency control method ensuring
serializability of transactions by dividing execution into two distinct phases:
1. Growing Phase: Transactions acquire all required locks but do not
release any.
2. Shrinking Phase: Transactions release locks and cannot acquire new
locks.
This protocol prevents conflicting operations from interleaving, ensuring
correct execution order.

Variations:
 Strict 2PL: All exclusive (write) locks are held until the transaction
commits or aborts. This avoids cascading rollbacks and guarantees
recoverability but reduces concurrency.
 Rigorous 2PL: Similar to Strict 2PL but holds all locks, both shared
and exclusive, until commit, simplifying recovery further but limiting
concurrency even more.
 Basic 2PL: Locks can be released anytime after the growing phase but
before commit, allowing higher concurrency but risking cascading
rollbacks.

Merits:

 Ensures conflict serializability.


 Provides a balance between consistency and concurrency.
Demerits:
 Can cause deadlocks, needing detection and resolution mechanisms.

 Strict variants reduce concurrency and increase transaction wait times.


2PL remains a foundational protocol in DBMS concurrency control.
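The two-phase rule itself is simple enough to express in a few lines. The sketch below (a minimal illustration, not a real lock manager, with names invented for this example) enforces exactly the growing/shrinking discipline: once a transaction releases any lock, acquiring another is an error.

```python
# Minimal sketch of the 2PL rule: after the first unlock (shrinking
# phase), no further locks may be acquired.
class TwoPhaseTxn:
    def __init__(self):
        self.held = set()
        self.shrinking = False  # flips to True on the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True   # growing phase is over
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")   # growing phase
t.unlock("A")              # shrinking phase begins
try:
    t.lock("C")            # illegal under 2PL
except RuntimeError as e:
    print(e)               # 2PL violation: lock after first unlock
```

Strict 2PL would additionally delay every `unlock` of a write lock until commit; Rigorous 2PL would delay all unlocks until commit.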


Deadlock Handling Techniques and Protocols


Deadlock occurs in a DBMS when two or more transactions wait indefinitely
for each other to release locks, causing the system to stall. It typically arises
in environments with lock-based concurrency control, especially when
transactions hold locks while waiting for others to release theirs.
Deadlock Handling Techniques
1. Deadlock Prevention
This technique avoids deadlocks by imposing rules to prevent circular
wait conditions:
o Wait-Die Scheme: Older transaction waits for the younger one;
younger requesting older's lock is rolled back.
o Wound-Wait Scheme: Older transaction preempts (wounds)
younger transaction holding the lock; younger is rolled back.
o Resource Ordering: Assigns a unique order to all resources.
Transactions request locks in a predefined order to avoid circular
wait.
2. Deadlock Avoidance
Uses information about resource requests to keep the system out of
unsafe states:
o Wait-For Graph (WFG) testing: before granting a lock, check whether
the resulting wait-for graph would contain a cycle; if so, the
request is delayed or refused.
o Banker’s Algorithm: as in operating systems, it checks whether
granting a request keeps the system in a safe state.
3. Deadlock Detection and Recovery
Allows deadlocks to occur but detects and resolves them:

o Periodic construction of a Wait-For Graph. Cycles are checked.


o Once detected, one or more transactions in the cycle are rolled
back to break the deadlock.
o Victim Selection: Transaction with the lowest cost of rollback is
chosen.
o Starvation Avoidance: Ensures no transaction is repeatedly
chosen as victim.
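Detecting a cycle in the wait-for graph is a standard depth-first search. A minimal sketch (graph representation and transaction names are assumptions for illustration):

```python
# Deadlock detection sketch: in the wait-for graph, an edge Ti -> Tj
# means Ti is waiting for a lock held by Tj. A cycle means deadlock.
def has_deadlock(wait_for):
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / in progress / done
    color = {t: WHITE for t in wait_for}

    def dfs(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:          # back edge => cycle
                return True
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1: the classic two-transaction deadlock.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_deadlock({"T1": ["T2"], "T2": []}))      # False
```

In a real DBMS this check runs periodically (or on lock timeouts), and when it returns true a victim transaction on the cycle is rolled back.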

Recovery Schemes in DBMS


DBMS ensures durability and atomicity through recovery mechanisms that
restore the database to a consistent state after failure.

Types of Failures
 Transaction Failure: Logical errors or system crashes.

 System Crash: Power failure or OS crash.


 Media Failure: Disk crash or corruption.
Log-Based Recovery
Recovery depends on logs that store all modifications. There are two major
techniques:

1. Deferred Database Modification


o Updates are not applied to the database until the transaction
reaches the commit point.
o Before that, changes are recorded in the log file.
o If the system crashes before commit, changes are ignored during
recovery.

o If commit occurred, logs are replayed to redo the transaction.


Advantages: Simple; no undo required.
Disadvantages: Increased response time due to deferred writes.
2. Immediate Database Modification
o Changes are applied to the database as they are issued, before
commit.

o The log contains both before and after images.


o On recovery:
 Redo committed transactions.
 Undo uncommitted transactions using before-images.
Advantages: Faster execution.
Disadvantages: Complex recovery; both undo and redo required.
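The redo/undo passes of immediate modification can be illustrated with a toy log replay. This is a simplified sketch, not a real recovery manager: the log format, values, and the committed set are all invented for the example.

```python
# Toy recovery pass over an immediate-modification log.
# Each log record: (txn, item, before_image, after_image).
log = [
    ("T1", "A", 10, 20),
    ("T2", "B", 5, 50),
    ("T1", "A", 20, 30),
]
committed = {"T1"}          # T2 crashed before writing COMMIT

db = {"A": 30, "B": 50}     # on-disk state at crash (eager updates applied)

# Redo committed transactions scanning forward...
for txn, item, before, after in log:
    if txn in committed:
        db[item] = after

# ...then undo uncommitted transactions scanning backward, via before-images.
for txn, item, before, after in reversed(log):
    if txn not in committed:
        db[item] = before

print(db)  # {'A': 30, 'B': 5}: T1's effects kept, T2 rolled back
```

Deferred modification would need only the redo pass, since uncommitted changes never reach the database, which is the trade-off described above.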
Checkpoints
A checkpoint is a snapshot of the current DB state. It reduces recovery time
by marking a consistent state from which recovery can proceed.
Shadow Paging
Another technique that avoids logging by maintaining two page tables:
current and shadow. Changes go to current pages. If a crash occurs, the
shadow pages are still intact.

These mechanisms ensure that DBMS maintains data integrity even during
unexpected failures or concurrency issues.

Concurrency and Deadlock in DBMS


Meaning of Concurrency:
Concurrency in DBMS refers to the simultaneous execution of multiple
transactions in a way that ensures data consistency and system correctness.
In a multi-user database system, several users may want to access the
database at the same time. Concurrency control ensures that the
simultaneous operations do not conflict with each other, and the integrity of
the database is maintained.
Example:
If two transactions are trying to update the same record simultaneously,
concurrency control ensures that only one succeeds at a time, or their effects
are serialized properly.
Goals of Concurrency Control:

 Maintain database consistency.

 Preserve isolation of transactions.


 Improve system throughput and resource utilization.

Meaning of Deadlock:
A deadlock is a situation where two or more transactions are waiting for each
other to release locks, and none of them can proceed. It is a serious problem
in concurrent transaction processing and can lead to system stalling.
Example:
Transaction T1 locks resource A and waits for B, while T2 locks B and waits for
A. Neither can proceed, leading to a deadlock.

Types of Deadlocks in DBMS:


1. Resource Deadlock:
Occurs when transactions wait indefinitely for resources (like tables or
rows) locked by each other.
2. Communication Deadlock:
Happens in distributed databases where two processes wait
indefinitely for a message from each other.
3. Detection-based Deadlock:
The system allows deadlocks to occur but detects them by constructing
a wait-for graph and resolving by rolling back transactions.
4. Prevention-based Deadlock:
Uses rules or protocols (e.g., wait-die, wound-wait) to ensure that
deadlocks do not occur by breaking one of the four Coffman
conditions.
5. Avoidance-based Deadlock:
System dynamically checks whether a state is safe before allowing
resource allocation. Banker’s algorithm is an example.


SQL Commands and Concepts


SQL (Structured Query Language) is used for managing and manipulating
relational databases. It includes various commands and features to handle
data efficiently.

Basic SQL Commands:


 SELECT: Retrieves data from one or more tables.
SELECT name FROM Employee;
 INSERT: Adds new data.
INSERT INTO Employee (id, name) VALUES (1, 'Alice');
 DELETE: Removes data.
DELETE FROM Employee WHERE id = 1;
ALTER Command:
Used to modify the structure of a table.
Example:
ALTER TABLE Employee ADD age INT;
UPDATE Command:
Modifies existing data in a table.
Example:
UPDATE Employee SET salary = salary + 1000 WHERE dept = 'HR';
Aggregate Functions:
 SUM(), AVG(), COUNT(), MAX(), MIN() are used for calculations.
Example:
SELECT AVG(salary) FROM Employee;
Set Theory Commands:

 UNION: Combines results and removes duplicates.

 INTERSECT: Shows common rows.


 EXCEPT: Shows rows from first query not in the second.
Example:
SELECT name FROM A UNION SELECT name FROM B;
Subqueries:
A query within another query.
Example:
SELECT name FROM Employee WHERE salary > (SELECT AVG(salary) FROM
Employee);
Views:
A view is a virtual table created by a query.
Example:
CREATE VIEW high_earners AS SELECT name FROM Employee WHERE salary >
50000;
ODBC and JDBC:
 ODBC (Open Database Connectivity) is a standard C-based API, commonly
used from C/C++ applications, to access SQL databases.
 JDBC (Java Database Connectivity) allows Java programs to interact
with databases.

Triggers:
Triggers are automatic actions executed in response to specific events on a
table.
Example:
CREATE TRIGGER log_update AFTER UPDATE ON Employee FOR EACH ROW
INSERT INTO log_table VALUES (...);
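A trigger of this shape can be run end to end in SQLite via Python's sqlite3. The schema below is an assumed example filling out the elided column list; the OLD/NEW row references are standard trigger syntax:

```python
import sqlite3

# Sketch of an AFTER UPDATE trigger: every salary update on employee
# automatically appends a row to log_table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER);
CREATE TABLE log_table (emp_id INTEGER, old_salary INTEGER, new_salary INTEGER);
CREATE TRIGGER log_update AFTER UPDATE ON employee
FOR EACH ROW
BEGIN
    INSERT INTO log_table VALUES (OLD.id, OLD.salary, NEW.salary);
END;
INSERT INTO employee VALUES (1, 'Alice', 40000);
UPDATE employee SET salary = 45000 WHERE id = 1;
""")
print(con.execute("SELECT * FROM log_table").fetchone())  # (1, 40000, 45000)
```

No application code inserted into `log_table`: the trigger fired automatically on the UPDATE, which is the defining property of triggers described above.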
