DBMS Mid2

UNIT 3

1(a).Explain the basic form of an SQL query.

The basic form of an SQL query follows a structured syntax to retrieve or manipulate data in a database. The
general structure of a SELECT query is:

SELECT column1, column2, ...
FROM table_name
WHERE condition
GROUP BY column_name
HAVING condition
ORDER BY column_name [ASC|DESC];

Explanation of Each Clause:

1. SELECT – Specifies the columns to retrieve from the table.


2. FROM – Indicates the table from which data is fetched.
3. WHERE – Filters records based on a specified condition.
4. GROUP BY – Groups rows that have the same values in specified columns.
5. HAVING – Filters grouped records (used with GROUP BY).
6. ORDER BY – Sorts the result in ascending (ASC) or descending (DESC) order.

Example:
SELECT city, COUNT(*) AS num_students FROM students WHERE age > 18 GROUP BY city HAVING COUNT(*) > 5
ORDER BY city ASC;

This query counts students older than 18 in each city, keeps only the cities with more than 5 such students, and
sorts the result by city name in ascending order. (Note that only grouped columns and aggregates may appear in the
SELECT list when GROUP BY is used.)

1(b).Write SQL queries with arithmetic and logical operators

1. Using Arithmetic Operators (+, -, *, /, %)

Arithmetic operators are used to perform mathematical operations on numeric values.

Example 1: Calculating total price (multiplication)

SELECT product_name, quantity, price_per_unit, (quantity * price_per_unit) AS total_price FROM orders;

● This query calculates the total price of each product ordered by multiplying the quantity with the price per
unit.

Example 2: Finding age after 5 years (addition)

SELECT name, age, (age + 5) AS age_after_5_years FROM students;

● Adds 5 years to each student's current age.

Example 3: Calculating discount price (subtraction)

SELECT product_name, price, (price - (price * 0.10)) AS discounted_price FROM products;

● Applies a 10% discount to the product price.

Example 4: Finding remainder (modulus)

SELECT employee_id, salary, (salary % 1000) AS remainder_salary FROM employees;

● Finds the remainder when salary is divided by 1000.

2. Using Logical Operators (AND, OR, NOT)


Logical operators are used to combine multiple conditions in SQL.

Example 5: Using AND (Both conditions must be true)

SELECT name, age, city FROM students

WHERE age > 18 AND city = 'Hyderabad';

● Selects students older than 18 who live in Hyderabad.

Example 6: Using OR (At least one condition must be true)

SELECT name, department, salary FROM employees

WHERE department = 'HR' OR salary > 50000;

● Selects employees who either work in HR or have a salary greater than 50,000.

Example 7: Using NOT (Negates a condition)

SELECT name, age, city FROM students

WHERE NOT city = 'Delhi';

● Selects students who do not live in Delhi.

Example 8: Combining AND, OR, and NOT

SELECT name, age, department, salary FROM employees

WHERE (age > 30 AND department = 'Finance') OR NOT salary < 40000;

● Selects employees who are either:


○ Older than 30 and work in Finance OR
○ Have a salary of 40,000 or more.

These queries show how arithmetic and logical operators are applied in SQL to filter and manipulate data efficiently.

2.Explain about different SQL functions

SQL Functions:

SQL functions are built-in methods used to perform operations on data in a database. These functions can be
categorized into Aggregate Functions, String Functions, Date Functions, Numeric Functions, and Conversion
Functions.

1. Aggregate Functions

Aggregate functions operate on a set of values and return a single result.

Function Description

COUNT() Returns the number of rows

SUM() Adds up all values in a column


AVG() Returns the average value of a column

MAX() Returns the highest value in a column

MIN() Returns the lowest value in a column

Example:

SELECT COUNT(*) AS total_students, AVG(marks) AS average_marks FROM students;

2. String Functions

String functions manipulate text data.

Function Description

UPPER() Converts text to uppercase

LOWER() Converts text to lowercase

LENGTH() Returns the length of a string

CONCAT() Combines two or more strings

SUBSTRING() Extracts a part of a string

TRIM() Removes spaces from the beginning and end

Example:

SELECT UPPER(name) AS name_in_caps, LENGTH(name) AS name_length FROM employees;

3. Date Functions

Date functions manipulate date and time values.

Function Description

NOW() Returns the current date and time


CURDATE() Returns the current date

CURTIME() Returns the current time

DATEADD() Adds a specific interval to a date

DATEDIFF() Returns the difference between two dates

DAY(), MONTH(), YEAR() Extracts day, month, or year

Example:

SELECT CURDATE() AS today, YEAR(CURDATE()) AS current_year;

4. Numeric Functions

Numeric functions perform mathematical operations.

Function Description

ROUND() Rounds a number to a specified decimal place

CEIL() Rounds up to the next whole number

FLOOR() Rounds down to the nearest whole number

ABS() Returns the absolute value

MOD() Returns the remainder of division

Example:

SELECT price, ROUND(price, 2) AS rounded_price, ABS(-10) AS absolute_value FROM products;

5. Conversion Functions

Conversion functions convert data from one type to another.


Function Description

CAST() Converts a value to a specified type

CONVERT() Converts a value to a specified type

Example:

SELECT CAST(123.45 AS INT) AS integer_value, CONVERT('2025-02-24', DATE) AS date_value;

(Note: this CONVERT argument order is MySQL-style; SQL Server reverses it, e.g., CONVERT(DATE, '2025-02-24').)

3.Create tables using unique, primary key, check and foreign key constraints.

1. Creating the students Table

CREATE TABLE students (

student_id INT PRIMARY KEY, -- Primary Key Constraint

name VARCHAR(50) NOT NULL,

age INT CHECK (age >= 18), -- Check Constraint (age must be 18 or above)

email VARCHAR(100) UNIQUE -- Unique Constraint (ensures no duplicate emails)

);

● PRIMARY KEY (student_id): Ensures each student has a unique identifier.


● CHECK (age >= 18): Ensures students must be at least 18 years old.
● UNIQUE (email): Ensures no two students have the same email.

2. Creating the courses Table

CREATE TABLE courses (

course_id INT PRIMARY KEY, -- Primary Key Constraint

course_name VARCHAR(100) UNIQUE NOT NULL, -- Unique Constraint

credits INT CHECK (credits BETWEEN 1 AND 10) -- Check Constraint (valid credit range)

);

● PRIMARY KEY (course_id): Ensures each course has a unique ID.


● UNIQUE (course_name): Ensures course names are unique.
● CHECK (credits BETWEEN 1 AND 10): Restricts the number of credits between 1 and 10.

3. Creating the enrollments Table (Using FOREIGN KEY)

CREATE TABLE enrollments (

enrollment_id INT PRIMARY KEY, -- Primary Key Constraint

student_id INT, course_id INT,

enrollment_date DATE NOT NULL DEFAULT CURRENT_DATE,

FOREIGN KEY (student_id) REFERENCES students(student_id) ON DELETE CASCADE,


FOREIGN KEY (course_id) REFERENCES courses(course_id) ON DELETE SET NULL

);

● FOREIGN KEY (student_id) REFERENCES students(student_id) ON DELETE CASCADE: If a student is
deleted, their enrollments are also deleted.
● FOREIGN KEY (course_id) REFERENCES courses(course_id) ON DELETE SET NULL: If a course is
deleted, its enrollments remain but with a NULL course ID.

Summary of Constraints Used

Constraint Description

PRIMARY KEY Ensures a unique identifier for each row.

UNIQUE Ensures no duplicate values in a column.

CHECK Enforces a condition on column values.

FOREIGN KEY Links a column to another table to maintain referential integrity.

4(a).Explain about nested and correlated nested queries.

Nested Queries in SQL

A nested query (also known as a subquery) is a query that is placed inside another SQL query. The inner query is
executed first, and its result is used by the outer query.

Types of Nested Queries

1. Simple Nested Query


2. Correlated Nested Query

1. Simple Nested Queries (Independent Subqueries)

A simple nested query runs independently, meaning the inner query executes first, and then its result is used by
the outer query.

Example: Find students who have the highest marks

SELECT name, marks FROM students

WHERE marks = (SELECT MAX(marks) FROM students);

● The inner query SELECT MAX(marks) FROM students; returns the highest marks.
● The outer query then selects students whose marks match the result.

Example: Find employees working in the 'HR' department

SELECT name, salary FROM employees

WHERE department_id = (SELECT department_id FROM departments WHERE department_name = 'HR');


● The inner query fetches the department_id of 'HR'.
● The outer query retrieves employees who belong to that department.

2. Correlated Nested Queries

A correlated nested query is dependent on the outer query, meaning the inner query runs for each row processed
by the outer query.

Example: Find employees earning more than the average salary of their department

SELECT name, department_id, salary FROM employees e1

WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e1.department_id = e2.department_id);

● The inner query calculates the average salary for the specific department (e2.department_id).
● The outer query checks which employees (e1.salary) earn more than that department’s average.

Example: Find students who have scored above the average marks of their class

SELECT name, marks FROM students s1

WHERE marks > (SELECT AVG(marks) FROM students s2 WHERE s1.class_id = s2.class_id);

● The inner query calculates the average marks for the student's class.
● The outer query filters students who scored above their class average.

Key Differences Between Nested and Correlated Nested Queries

Feature Simple Nested Query Correlated Nested Query

Execution Inner query runs once, before the outer query Inner query runs for each row of the outer query

Dependency Inner query is independent Inner query depends on the outer query

Performance Generally faster Slower due to repeated execution

Use Case Filtering based on a single value from the inner query Filtering based on row-wise conditions

Conclusion

● Use simple nested queries when you need a single value from the inner query.
● Use correlated nested queries when the inner query depends on each row of the outer query.

4(b).Discuss set operators with examples

Set operators in SQL are used to combine the results of two or more SELECT queries. The major set operators
in SQL are:

1. UNION
2. UNION ALL
3. INTERSECT
4. EXCEPT (or MINUS in some databases)

Rules for Using Set Operators

● The number of columns in both queries must be the same.


● The data types of corresponding columns must be compatible.
● ORDER BY is applied once, at the end of the combined result (see the sketch below).
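
For example, a minimal sketch of the last rule, reusing the students and teachers tables from the examples that follow:

SELECT name FROM students
UNION
SELECT name FROM teachers
ORDER BY name ASC; -- sorts the combined result, not the individual queries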

1. UNION

The UNION operator combines results from two queries and removes duplicate rows.

Example: Get a list of all students and teachers

SELECT name FROM students UNION SELECT name FROM teachers;

● If a person is both a student and a teacher, their name appears only once.

2. UNION ALL

The UNION ALL operator works like UNION, but does not remove duplicates.

Example: Get all students and teachers, including duplicates

SELECT name FROM students UNION ALL SELECT name FROM teachers;

● Unlike UNION, this query keeps duplicates.

3. INTERSECT

The INTERSECT operator returns only common rows between two queries.

Example: Find names that are both in students and teachers

SELECT name FROM students INTERSECT SELECT name FROM teachers;

● Returns only names present in both tables.

4. EXCEPT (or MINUS)

The EXCEPT operator returns rows from the first query that are not in the second query.

Example: Find students who are not teachers

SELECT name FROM students EXCEPT SELECT name FROM teachers;

● Returns names that exist in students but not in teachers.

Operator Removes Duplicates Returns

UNION Yes Combines results without duplicates

UNION ALL No Combines results with duplicates

INTERSECT Yes Common rows in both queries

EXCEPT (MINUS) Yes Rows from first query not in second

5(a).Illustrate different aggregate functions in SQL with examples

Aggregate functions in SQL perform calculations on a set of values and return a single result. These functions are
commonly used in SELECT statements, often with the GROUP BY clause.

1. COUNT() – Counting Rows

The COUNT() function returns the total number of rows that match a condition.

Example: Count total students

SELECT COUNT(*) AS total_students FROM students;

● This query counts all students in the students table.

Example: Count students in a specific department

SELECT COUNT(*) AS cs_students FROM students WHERE department = 'CSE';

● Counts only students in the "CSE" department.

2. SUM() – Adding Values

The SUM() function adds up all values in a column.

Example: Calculate total salary of employees

SELECT SUM(salary) AS total_salary FROM employees;

● Returns the sum of all employee salaries.

Example: Total sales for a specific product

SELECT SUM(amount) AS total_sales FROM orders WHERE product_id = 101;

● Calculates total sales for product 101.

3. AVG() – Calculating the Average

The AVG() function returns the average value of a numeric column.

Example: Find the average marks of students

SELECT AVG(marks) AS average_marks FROM students;

● Returns the average marks of all students.

Example: Find the average salary of employees in each department

SELECT department, AVG(salary) AS avg_salary FROM employees GROUP BY department;

● Groups employees by department and calculates average salary for each department.

4. MAX() – Finding the Maximum Value


The MAX() function returns the highest value in a column.

Example: Find the highest salary in the company

SELECT MAX(salary) AS highest_salary FROM employees;

● Returns the highest salary in the employees table.

Example: Find the most expensive product

SELECT product_name, price FROM products WHERE price = (SELECT MAX(price) FROM products);

● Returns the highest-priced product.

5. MIN() – Finding the Minimum Value

The MIN() function returns the lowest value in a column.

Example: Find the lowest salary in the company

SELECT MIN(salary) AS lowest_salary FROM employees;

● Returns the lowest salary.

Example: Find the cheapest product

SELECT product_name, price FROM products WHERE price = (SELECT MIN(price) FROM products);

● Returns the least expensive product.

Using Multiple Aggregate Functions Together

You can combine multiple aggregate functions in a single query.

Example: Get total, average, highest, and lowest salary

SELECT COUNT(*) AS total_employees, SUM(salary) AS total_salary, AVG(salary) AS average_salary,

MAX(salary) AS highest_salary, MIN(salary) AS lowest_salary FROM employees;

● Retrieves multiple statistics about employee salaries in one query.

5(b).Explain group by, having and ordering clauses.

SQL provides powerful clauses like GROUP BY, HAVING, and ORDER BY to organize and filter query results.

1. GROUP BY Clause

The GROUP BY clause is used to group rows with the same values in one or more columns and perform
aggregate functions like COUNT(), SUM(), AVG(), etc.

SELECT column_name, aggregate_function(column_name)

FROM table_name

GROUP BY column_name;

Example: Count the number of employees in each department

SELECT department, COUNT(*) AS total_employees

FROM employees
GROUP BY department;

● Groups employees by department and counts the number of employees in each department.

2. HAVING Clause

The HAVING clause is used to filter grouped results based on an aggregate function. It is similar to WHERE, but
WHERE cannot be used with aggregate functions.

SELECT column_name, aggregate_function(column_name)

FROM table_name

GROUP BY column_name

HAVING condition;

Example: Find departments with more than 5 employees

SELECT department, COUNT(*) AS total_employees

FROM employees

GROUP BY department

HAVING COUNT(*) > 5;

● First, it groups employees by department.


● Then, it filters only departments where the count of employees is greater than 5.

Difference Between WHERE and HAVING

● WHERE filters before grouping.


● HAVING filters after grouping (see the sketch below).
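
A minimal sketch contrasting the two (the status column is an assumed example attribute):

SELECT department, AVG(salary) AS avg_salary
FROM employees
WHERE status = 'ACTIVE' -- row-level filter, applied before GROUP BY
GROUP BY department
HAVING AVG(salary) > 50000; -- group-level filter, applied after GROUP BY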

3. ORDER BY Clause

The ORDER BY clause is used to sort the result set in ascending (ASC) or descending (DESC) order.

SELECT column_name

FROM table_name

ORDER BY column_name [ASC|DESC];

Example: Sort employees by salary in descending order

SELECT name, salary

FROM employees

ORDER BY salary DESC;

● Orders employees by salary from highest to lowest.

Example: Sort employees first by department (A-Z), then by salary (high to low)

SELECT name, department, salary FROM employees ORDER BY department ASC, salary DESC;

● First sorts by department (A-Z).


● Then, sorts salaries within each department (high to low).

Combining GROUP BY, HAVING, and ORDER BY

You can use all three clauses together in a query.

Example: Find departments with more than 5 employees and sort them by total employees (descending)

SELECT department, COUNT(*) AS total_employees FROM employees

GROUP BY department

HAVING COUNT(*) > 5

ORDER BY total_employees DESC;

● Groups employees by department.


● Filters departments where employee count is greater than 5.
● Sorts the results by employee count in descending order.

Clause Purpose

GROUP BY Groups rows based on column values.

HAVING Filters grouped rows based on aggregate functions.

ORDER BY Sorts the result set in ascending (ASC) or descending (DESC) order.

6(a).Discuss about joins

Joins in SQL are used to combine rows from two or more tables based on a related column between them. They
help retrieve meaningful information from multiple tables efficiently.

Types of Joins

1. INNER JOIN
○ Returns only the matching records from both tables.
○ Non-matching rows are excluded.

Syntax:
SELECT column_names FROM table1 INNER JOIN table2 ON table1.common_column = table2.common_column;

Example:
SELECT employees.name, departments.dept_name FROM employees INNER JOIN departments ON
employees.dept_id = departments.dept_id;

2. LEFT JOIN (LEFT OUTER JOIN)


○ Returns all records from the left table and matching records from the right table.
○ If no match is found, NULL is returned for columns from the right table.

Syntax:
SELECT column_names FROM table1 LEFT JOIN table2 ON table1.common_column = table2.common_column;

Example:
SELECT employees.name, departments.dept_name FROM employees LEFT JOIN departments

ON employees.dept_id = departments.dept_id;
3. RIGHT JOIN (RIGHT OUTER JOIN)
○ Returns all records from the right table and matching records from the left table.
○ If no match is found, NULL is returned for columns from the left table.

Syntax:
SELECT column_names FROM table1 RIGHT JOIN table2 ON table1.common_column = table2.common_column;

Example:
SELECT employees.name, departments.dept_name FROM employees RIGHT JOIN departments

ON employees.dept_id = departments.dept_id;

4. FULL JOIN (FULL OUTER JOIN)


○ Returns all records when there is a match in either table.
○ If there is no match, NULL values are returned for missing data.

Syntax:
SELECT column_names FROM table1 FULL JOIN table2 ON table1.common_column = table2.common_column;

Example:
SELECT employees.name, departments.dept_name FROM employees FULL JOIN departments ON
employees.dept_id = departments.dept_id;

5. CROSS JOIN
○ Produces a Cartesian product, where each row from the first table is combined with every row from
the second table.

Syntax:
SELECT column_names FROM table1 CROSS JOIN table2;

Example:
SELECT employees.name, departments.dept_name FROM employees CROSS JOIN departments;

6. SELF JOIN
○ A table joins itself to compare rows within the same table.
○ Uses an alias to differentiate table instances.

Syntax:
SELECT A.column_name, B.column_name FROM table_name A, table_name B WHERE condition;

Example:
SELECT A.employee_name, B.employee_name AS Manager FROM employees A, employees B

WHERE A.manager_id = B.employee_id;

6(b).Describe the problems of null values and outer joins.

1. Problems of NULL Values

A NULL value in SQL represents missing or unknown data. While NULL values help handle incomplete data, they
introduce several challenges:

a) Issues in Arithmetic Operations

● Any arithmetic operation involving NULL results in NULL.

Example: SELECT salary + bonus AS total_income FROM employees;

If bonus is NULL, total_income will also be NULL, which can cause incorrect calculations.

b) Issues in Comparisons

● Comparisons involving NULL using = or != return UNKNOWN, not TRUE or FALSE.


Example: SELECT * FROM employees WHERE salary = NULL;

This returns no results because NULL = NULL is UNKNOWN.

Instead, IS NULL should be used:

SELECT * FROM employees WHERE salary IS NULL;

c) Problems with Aggregation Functions

● Functions like SUM(), AVG(), COUNT(), etc., ignore NULL values.

Example: SELECT AVG(salary) FROM employees;

If some employees have NULL salaries, they are excluded from the calculation, potentially skewing results.

d) Issues with Joins and Filtering

● When using joins, NULL values in key columns can prevent proper matching.

Example: SELECT * FROM employees INNER JOIN departments ON employees.dept_id = departments.dept_id;

Employees with NULL dept_id will not appear in the result.

e) Challenges in Constraints and Indexing

● NULL values can affect unique constraints.


● Example: In most databases (e.g., MySQL, PostgreSQL, Oracle), a column with a UNIQUE constraint can have
multiple NULL values because NULL is not considered a duplicate; SQL Server, by contrast, allows only one NULL
in a UNIQUE column.
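
A minimal sketch (table and column names invented for illustration):

CREATE TABLE contacts (
    contact_id INT PRIMARY KEY,
    phone VARCHAR(15) UNIQUE
);

INSERT INTO contacts VALUES (1, NULL);
INSERT INTO contacts VALUES (2, NULL); -- accepted in MySQL/PostgreSQL/Oracle; SQL Server would reject this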

2. Problems of Outer Joins (LEFT, RIGHT, FULL OUTER JOIN)

Outer joins (LEFT, RIGHT, FULL) return unmatched rows with NULL values from one or both tables. While useful,
they can introduce issues:

a) NULLs in Result Set

● When there is no match, columns from the unmatched table return NULL.

Example (LEFT JOIN): SELECT employees.name, departments.dept_name FROM employees

LEFT JOIN departments ON employees.dept_id = departments.dept_id;

If an employee has no department assigned, dept_name will be NULL.

b) Incorrect Assumptions in Filtering

● Conditions in WHERE can eliminate NULL rows unintentionally.

Example: SELECT employees.name, departments.dept_name FROM employees LEFT JOIN departments ON
employees.dept_id = departments.dept_id WHERE departments.dept_name <> 'HR';

○ This removes NULL values, making it behave like an INNER JOIN.


○ Instead, use OR departments.dept_name IS NULL to keep unmatched records.

c) Performance Overhead

● FULL OUTER JOIN can be slow on large datasets because it returns all records from both tables, filling
unmatched rows with NULL.

d) Complex Query Logic

● When dealing with NULLs in outer joins, additional conditions (COALESCE(), CASE, IS NULL) are often
needed to handle missing values properly.
Example: SELECT name, COALESCE(dept_name, 'No Department') AS department

FROM employees LEFT JOIN departments ON employees.dept_id = departments.dept_id;

This replaces NULL with a default value.

7.Explain about views in detail.

A view in SQL is a virtual table based on the result of a SQL query. It does not store data physically but provides a
stored query that can be executed when needed. Views help in simplifying complex queries, improving security, and
maintaining data abstraction.

Key Features of Views

● A view is created using the CREATE VIEW statement.


● It retrieves data dynamically from the underlying tables.
● It can be used in queries just like a regular table.
● It helps in hiding complex queries and restricting access to certain columns or rows.
● Views can be updated if they meet certain conditions.

Creating a View

A view is created using the CREATE VIEW statement.

Syntax

CREATE VIEW view_name AS SELECT column1, column2, … FROM table_name WHERE condition;

Example

Consider a table named employees:

emp_id name department salary

101 Alice HR 50000

102 Bob IT 60000

103 Charlie Sales 55000

We create a view that shows only the employees working in the IT department:

CREATE VIEW IT_Employees AS SELECT emp_id, name, salary FROM employees

WHERE department = 'IT';

Now, querying the view gives: SELECT * FROM IT_Employees;

Emp_id name salary

102 Bob 60000

Types of Views
1. Simple Views

● Based on a single table.


● Can be updated if it includes all required primary key columns.

Example:
CREATE VIEW High_Salary AS SELECT name, salary FROM employees WHERE salary > 55000;

2. Complex Views

● Based on multiple tables using JOIN.


● May contain aggregate functions (SUM(), AVG(), etc.).
● Usually not updatable.

Example:
CREATE VIEW Employee_Department AS

SELECT employees.name, employees.salary, departments.dept_name FROM employees

JOIN departments ON employees.department = departments.dept_id;

3. Inline Views

● Temporary views used inside a SQL statement (often in FROM clause).

Example:
SELECT AVG(salary) FROM

(SELECT salary FROM employees WHERE department = 'IT') AS IT_Salaries;

4. Materialized Views

● Unlike regular views, materialized views store data physically for better performance.
● Used for large queries that don’t need frequent updates.
● Requires manual refresh.

Example (specific to databases like Oracle):


CREATE MATERIALIZED VIEW EmployeeSummary AS

SELECT department, AVG(salary) AS AvgSalary FROM employees GROUP BY department;

Modifying Views

Updating a View

● Views can be modified using CREATE OR REPLACE VIEW.

Example: CREATE OR REPLACE VIEW IT_Employees AS SELECT emp_id, name, salary, department FROM
employees WHERE department = 'IT';

Deleting a View

● Use DROP VIEW to remove a view.

Example: DROP VIEW IT_Employees;

Advantages of Views

1. Data Security

● Restricts access to sensitive columns by displaying only necessary data.


Example: Hiding salary details:
CREATE VIEW EmployeePublic AS SELECT emp_id, name, department FROM employees;

2. Simplifies Complex Queries

● Stores frequently used queries in a simple format.

Example:
CREATE VIEW SalesReport AS SELECT sales.date, customers.name, sales.amount FROM sales

JOIN customers ON sales.customer_id = customers.customer_id;

3. Data Consistency

● Provides a consistent interface even if the underlying table structure changes.

4. Reduces Redundancy

● Helps avoid duplicate query writing.

Limitations of Views

1. Performance Issues
○ Since views are virtual tables, each query runs on the base table.
○ For better performance, materialized views are preferred.
2. Cannot Modify Certain Views
○ Complex views with JOIN, GROUP BY, or DISTINCT cannot be updated directly.

Example: UPDATE Employee_Department SET salary = 70000 WHERE name = 'Alice'; -- This might fail if the view
has a JOIN

3. Dependent on Base Tables


○ If a base table is deleted, the view becomes invalid.

Views are a powerful SQL feature that enhance security, simplify queries, and improve data abstraction. While they
offer advantages in query efficiency and user access control, they also come with limitations like update restrictions
and performance concerns. Understanding how and when to use views is essential for efficient database
management.

8.Write SQL queries using the following relational database. Sailors (sid:integer,
sname:string, rating:integer, age:real),Boats (bid:integer, bname:string,
color:string),Reserves (sid:integer, bid:integer, day:date)

a) Find the average age of sailors with a rating of 10.


b) Find the name and age of the oldest sailor.
c) Find the sailors with the highest rating.
d) Find the names of sailors who are older than the oldest sailor with a rating of 10.
e) Find the age of the youngest sailor who is eligible to vote for each rating level.

a)
SELECT AVG(age) AS avg_age FROM Sailors
WHERE rating = 10;

● This query calculates the average age of sailors who have a rating of 10.

b)
SELECT sname, age
FROM Sailors
WHERE age = (SELECT MAX(age) FROM Sailors);
● This query first finds the maximum age from the Sailors table and retrieves the sname and age of the
sailor(s) with that age.

c)
SELECT sid, sname, rating FROM Sailors
WHERE rating = (SELECT MAX(rating) FROM Sailors);

● This query finds the highest rating using MAX(rating) and retrieves all sailors who have that rating.

d)
SELECT sname FROM Sailors
WHERE age > (SELECT MAX(age) FROM Sailors WHERE rating = 10);

● The subquery finds the maximum age among sailors with a rating of 10.
● The outer query retrieves the names of sailors whose age is greater than that.

e) (Assuming the voting age is 18 years)


SELECT rating, MIN(age) AS youngest_voter_age FROM Sailors
WHERE age >= 18 GROUP BY rating;
● The query finds the minimum age of sailors who are at least 18 years old, grouped by rating.

9.Write SQL queries using the following relational database. Sailors (sid:integer,
sname:string, rating:integer, age:real) Boats (bid:integer, bname:string, color:string)
Reserves (sid:integer, bid:integer, day:date)
a) Find the names of sailors who have reserved boat 100.
b) Find the names of sailors who have reserved a red or a green boat.
c) Find the colors of boats reserved by the sailor Anil.
d) Find the sids of sailors with age over 20 who have not reserved a red boat.
e) Find the names of sailors who have reserved all boats.

a)
SELECT DISTINCT s.sname FROM Sailors s
JOIN Reserves r ON s.sid = r.sid WHERE r.bid = 100;

● This query finds sailors (sname) who have reserved the boat with bid = 100 using a JOIN between Sailors
and Reserves.

b)
SELECT DISTINCT s.sname FROM Sailors s
JOIN Reserves r ON s.sid = r.sid
JOIN Boats b ON r.bid = b.bid
WHERE b.color IN ('Red', 'Green');

● This query finds sailors (sname) who have reserved boats that are either red or green, using a JOIN
between Sailors, Reserves, and Boats.

c)
SELECT DISTINCT b.color FROM Boats b
JOIN Reserves r ON b.bid = r.bid
JOIN Sailors s ON r.sid = s.sid
WHERE s.sname = 'Anil';

● This query retrieves all distinct colors of boats reserved by the sailor Anil.

d)
SELECT DISTINCT s.sid FROM Sailors s WHERE s.age > 20 AND s.sid NOT IN (
SELECT r.sid FROM Reserves r JOIN Boats b ON r.bid = b.bid
WHERE b.color = 'Red');

● This query selects sailors (sid) whose age > 20 and who have not reserved a red boat.
● The subquery retrieves sid of sailors who have reserved a red boat, and NOT IN ensures exclusion.

e)
SELECT s.sname FROM Sailors s WHERE NOT EXISTS (
SELECT b.bid
FROM Boats b
WHERE NOT EXISTS (
SELECT r.bid
FROM Reserves r
WHERE r.bid = b.bid AND r.sid = s.sid ) );

● This query uses double NOT EXISTS to find sailors who have reserved every boat in the Boats table.
● The inner query ensures that for each boat in Boats, there is at least one reservation by the sailor.

10.Write SQL queries using the following relational database. Students(sno:integer,
sname:string, age:integer, cid:integer) Enrolled (cid:integer, cname:string, fid:integer)
Faculty (fid:integer, fname:string, dept:string)

a) Find the names of students who are enrolled in a class taught by Harish.
b) Find the age of oldest student.
c) Find the names of students enrolled in History.
d) Find the department of faculty whose name starts with ‘s’.
e) Find the names of students who are enrolled in a class and age is over 17 taught by
Harish.

a)
SELECT DISTINCT s.sname FROM Students s
JOIN Enrolled e ON s.cid = e.cid
JOIN Faculty f ON e.fid = f.fid WHERE f.fname = 'Harish';

● This query finds students (sname) who are enrolled in a class taught by Harish using a JOIN between
Students, Enrolled, and Faculty.

b)
SELECT MAX(age) AS oldest_age FROM Students;

● This query retrieves the maximum age from the Students table.

c)
SELECT DISTINCT s.sname FROM Students s
JOIN Enrolled e ON s.cid = e.cid
WHERE e.cname = 'History';

● This query finds students (sname) who are enrolled in History by checking the cname column in Enrolled.

d)
SELECT DISTINCT dept FROM Faculty WHERE fname LIKE 'S%';

● This query retrieves departments of faculty members whose names start with ‘S’, using LIKE 'S%'.

e)
SELECT DISTINCT s.sname FROM Students s
JOIN Enrolled e ON s.cid = e.cid
JOIN Faculty f ON e.fid = f.fid
WHERE f.fname = 'Harish' AND s.age > 17;

● This query retrieves students (sname) who are enrolled in a class taught by Harish and whose age is greater
than 17.

11.Consider the following relational schemas Sailors (sid:integer, sname:string,
rating:integer, age:real) Boats (bid:integer, bname:string, color:string) Reserves
(sid:integer, bid:integer, day:date)
Based on the above schema, write the SQL queries for the following.
i) Find the colors of boats reserved by the sailor Anil.
ii) Find the names of sailors whose age is more than 20.
iii) Find the names of sailors who have reserved a red or green boat.
iv) Find the names of the sailors who have reserved both a Red boat and a Green boat.
v)Find names of sailors who have reserved all boats

i)
SELECT DISTINCT b.color FROM Boats b
JOIN Reserves r ON b.bid = r.bid JOIN Sailors s ON r.sid = s.sid
WHERE s.sname = 'Anil';

● This query finds distinct colors of boats reserved by the sailor Anil using JOINs.

ii)
SELECT sname FROM Sailors WHERE age > 20;

● This query retrieves sailors’ names whose age is greater than 20.

iii)
SELECT DISTINCT s.sname FROM Sailors s JOIN Reserves r ON s.sid = r.sid
JOIN Boats b ON r.bid = b.bid WHERE b.color IN ('Red', 'Green');

● This query retrieves sailors (sname) who have reserved either a red or green boat.

iv)
SELECT DISTINCT s.sname FROM Sailors s WHERE s.sid IN (
SELECT r1.sid FROM Reserves r1 JOIN Boats b1 ON r1.bid = b1.bid
WHERE b1.color = 'Red')
AND s.sid IN ( SELECT r2.sid FROM Reserves r2 JOIN Boats b2 ON r2.bid = b2.bid WHERE b2.color = 'Green');

● This query ensures that a sailor has reserved both a Red boat AND a Green boat by using two
subqueries.

v)
SELECT s.sname FROM Sailors s
WHERE NOT EXISTS ( SELECT b.bid FROM Boats b WHERE NOT EXISTS ( SELECT r.bid
FROM Reserves r
WHERE r.bid = b.bid AND r.sid = s.sid )
);

● This query finds sailors who have reserved every boat in the Boats table using double NOT EXISTS.

UNIT 4
1.Explain about the problems caused by Redundancy
Redundancy in databases and information systems refers to the unnecessary duplication of data, which can lead to
several issues in data management and integrity. Some of the key problems caused by redundancy include:

1. Increased Storage Space


○ When data is duplicated, it consumes more storage space, leading to higher costs for data storage
and management. This is especially problematic in large databases where multiple copies of the
same data exist.
2. Data Inconsistency
○ If redundant data is not updated properly, it may lead to inconsistencies. For example, if a customer's
address is stored in multiple places and one copy is updated while others are not, different records
may contain conflicting information.
3. Data Anomalies
○ Insertion Anomaly: Redundant data may prevent new records from being added without
unnecessary duplicate information.
○ Update Anomaly: Changes to data must be made in multiple places, increasing the chance of errors.
○ Deletion Anomaly: Deleting data from one table might remove critical information if not managed
properly.
4. Decreased Data Integrity
○ Redundant data can lead to integrity issues when different versions of the same data exist, making it
difficult to determine which version is accurate.
5. Slower Query Performance
○ When redundant data exists, database queries may take longer to execute because the system has
to scan and process more data than necessary. This affects overall system efficiency.
6. Higher Maintenance Costs
○ Managing redundant data requires additional effort, increasing administrative costs and the workload
for database administrators. Regular updates and checks must be performed to ensure data
consistency.
7. Complications in Data Normalization
○ In relational databases, redundancy violates normalization principles, leading to complex
relationships and difficulty in maintaining a well-structured database.

Solution: Normalization

To avoid redundancy and its associated problems, database normalization techniques are used. Normalization
involves organizing data into related tables to minimize duplication while ensuring data integrity and efficiency.

By reducing redundancy, organizations can improve data consistency, enhance performance, and lower storage and
maintenance costs.
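
To make the anomalies above concrete, here is a minimal sketch using a hypothetical denormalized table (all names invented for illustration):

-- The customer's address is repeated on every order row
CREATE TABLE orders_denormalized (
    order_id INT PRIMARY KEY,
    customer_name VARCHAR(50),
    customer_address VARCHAR(100),
    amount DECIMAL(10,2)
);

-- Update anomaly: the address must be changed on every matching row;
-- missing even one row leaves conflicting copies of the same fact
UPDATE orders_denormalized
SET customer_address = '12 New Street'
WHERE customer_name = 'Alice';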

2(a).Discuss the problems related to decompositions

In database design, decomposition refers to breaking a large relation (table) into smaller relations to remove
redundancy and ensure normalization. While decomposition helps in eliminating redundancy and anomalies, it can
also introduce several challenges:

1. Lossless Decomposition Issue

● A decomposition must be lossless to ensure that no data is lost when splitting relations.
● If the decomposition is not lossless, joining the pieces can produce spurious tuples, making it impossible to reconstruct the original relation correctly (see the sketch below).
● Solution: Ensure that the common attribute (joining key) in decomposed relations maintains sufficient
information to reconstruct the original table.
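
A minimal sketch of a lossy decomposition (tables and data invented for illustration): suppose Emp(ename, dept, manager) holds the rows (Ann, CS, Ram) and (Bob, CS, Sam), and is decomposed on the non-key attribute dept into Emp1(ename, dept) and Emp2(dept, manager).

SELECT e1.ename, e1.dept, e2.manager
FROM Emp1 e1 JOIN Emp2 e2 ON e1.dept = e2.dept;
-- Returns 4 rows, including the spurious tuples (Ann, CS, Sam) and
-- (Bob, CS, Ram), which were never in the original Emp relation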

2. Dependency Preservation Problem

● Functional dependencies define relationships between attributes in a table. When a table is decomposed,
some dependencies may be lost.
● If dependencies are not preserved, queries may require complex joins to retrieve missing data.
● Solution: Choose a decomposition that preserves all functional dependencies to maintain database integrity.

3. Increased Join Operations

● Decomposition may require frequent join operations to retrieve data, leading to performance issues.
● A poorly designed decomposition can slow down queries, especially for large datasets.
● Solution: Ensure decomposition balances normalization and efficiency by minimizing unnecessary joins.

4. Redundant Data in Some Cases

● Sometimes, decomposition may lead to redundancy if not designed correctly.


● If an attribute is stored in multiple tables to maintain functional dependencies, it can reintroduce redundancy.
● Solution: Use higher normal forms carefully to avoid redundancy while maintaining efficiency.

5. Loss of Semantic Meaning

● Splitting a relation into multiple tables may result in loss of context or meaning.
● Users may find it difficult to understand how different decomposed tables relate to each other.
● Solution: Maintain clear relationships between decomposed tables and ensure proper documentation.

Conclusion

Decomposition is necessary for database normalization but must be performed carefully to avoid issues like loss of
data, dependency violations, and performance degradation. A well-structured database design ensures that
decomposition achieves both efficiency and data integrity while minimizing unnecessary complexity.

2(b).Explain about Functional Dependencies.

A functional dependency (FD) is a constraint between two sets of attributes in a relational database. It describes
how the value of one attribute (or a set of attributes) determines the value of another attribute.

Notation

A functional dependency is denoted as: X→Y

Where:

● X (determinant): The attribute or set of attributes that determine another attribute.


● Y (dependent): The attribute whose value is determined by X.

Example:
In a STUDENT table, if Student_ID uniquely determines Student_Name, then: Student_ID → Student_Name

This means that for each Student_ID, there is only one Student_Name.

Types of Functional Dependencies

1. Trivial Functional Dependency


○ A functional dependency is trivial if the dependent attribute is a subset of the determinant.
○ Example: {Student_ID, Student_Name} → {Student_ID}. Here, Student_ID is already part of the left
side, so the dependency is trivial.
2. Non-Trivial Functional Dependency
○ A functional dependency is non-trivial if the dependent attribute is not a subset of the determinant.
○ Example: Student_ID→Student_Name Since Student_Name is not part of Student_ID, it is a
non-trivial dependency.
3. Partial Functional Dependency
○ If a non-prime attribute is functionally dependent on part of a candidate key, it is a partial
dependency.
○ Example: In a relation with composite key (Student_ID, Course_ID) and attributes Student_Name and
Course_Name, Student_ID → Student_Name is a partial dependency because Student_ID (only part of
the key) alone determines Student_Name.
4. Transitive Functional Dependency
○ If X → Y and Y → Z, then X → Z is a transitive dependency.
○ Example: If Student_ID→Department_ID and Department_ID→Department_Name then
Student_ID→Department_Name is a transitive dependency.
5. Multivalued Dependency (MVD)
○ If an attribute in a table has multiple independent values associated with another attribute, it is an
MVD.
○ Example: In a STUDENT table, if a student has multiple hobbies, Student_ID →→ Hobby
represents a multivalued dependency.

Importance of Functional Dependencies

Functional dependencies are crucial in:

● Database Normalization: Helps eliminate redundancy and anomalies.


● Determining Keys: Used to identify candidate keys and primary keys.
● Ensuring Data Integrity: Prevents inconsistency in data storage and retrieval.

Conclusion

Functional dependencies play a vital role in relational database design by ensuring data consistency, reducing
redundancy, and improving efficiency. Understanding FDs helps in normalization and structuring optimized
database schemas.

3.Distinguish different normal forms.

In database normalization, different normal forms (NF) help reduce redundancy and improve data integrity. The
main normal forms are:

1. First Normal Form (1NF)

A table is in 1NF if:

● All attributes contain atomic values (i.e., no multi-valued or composite attributes).


● Each column contains values of a single type.
● Each row has a unique identifier (Primary Key).

Example (Before 1NF - Multi-valued attributes):

Student_ID Name Subjects

101 Alice Math, Physics

102 Bob Chemistry

After 1NF (Atomic values in separate rows):

Student_ID Name Subject

101 Alice Math

101 Alice Physics

102 Bob Chemistry

2. Second Normal Form (2NF)

A table is in 2NF if:

● It is already in 1NF.
● No partial dependency exists (i.e., non-key attributes should depend on the whole primary key, not just a
part of it).

Example (Before 2NF - Partial Dependency):

Student_ID Course_ID Student_Name Course_Name

101 C1 Alice Math

102 C2 Bob Physics

Here, Student_Name depends only on Student_ID, not on the full (Student_ID, Course_ID) key.

After 2NF (Splitting into two tables):


Student Table

Student_ID Student_Name

101 Alice

102 Bob

Course Table

Course_ID Course_Name

C1 Math

C2 Physics

Enrollment Table (Bridging Table)

Student_ID Course_ID

101 C1

102 C2

3. Third Normal Form (3NF)

A table is in 3NF if:

● It is in 2NF.
● No transitive dependency exists (i.e., non-key attributes should depend only on the primary key, not on
other non-key attributes).
Example (Before 3NF - Transitive Dependency):

Student_ID Student_Name Department HOD

101 Alice CS Dr. Smith

102 Bob EE Dr. Brown

Here, HOD depends on Department, not directly on Student_ID.

After 3NF (Separate Department Table):


Student Table

Student_ID Student_Name Department

101 Alice CS

102 Bob EE

Department Table

Department HOD

CS Dr. Smith

EE Dr. Brown
4. Boyce-Codd Normal Form (BCNF)

A table is in BCNF if:

● It is in 3NF.
● Every determinant (a field that determines another field) is a candidate key.

Example (Before BCNF - Violation due to overlapping candidate keys):

Teacher_ID Course Room

T1 Math R1

T2 Physics R2

T1 Physics R3

Here, (Teacher_ID, Course) is a candidate key, but Room also determines Course, leading to redundancy.

After BCNF (Splitting into Two Tables):

Teacher_Course Table

Teacher_ID Course

T1 Math

T2 Physics

T1 Physics

Course_Room Table

Course Room

Math R1

Physics R2

Physics R3

5. Fourth Normal Form (4NF)

A table is in 4NF if:

● It is in BCNF.
● No multi-valued dependencies exist.

Example (Before 4NF - Multi-Valued Dependency):

Student_ID Course Hobby

101 Math Painting

101 Math Singing

101 Physics Painting

101 Physics Singing

Here, Course and Hobby are independent of each other but are related to Student_ID.

After 4NF (Separate the multi-valued attributes into two tables):


Student_Course Table

Student_ID Course

101 Math

101 Physics

Student_Hobby Table

Student_ID Hobby

101 Painting

101 Singing

6. Fifth Normal Form (5NF or PJNF - Project Join Normal Form)

A table is in 5NF if:

● It is in 4NF.
● It cannot be decomposed further without losing information (i.e., no join dependency).

7. Sixth Normal Form (6NF) (Rarely Used)

● Deals with temporal databases (databases that track historical changes).


● Ensures no non-trivial join dependencies exist.

Normal Form Key Condition

1NF Only atomic values (no multi-valued attributes)

2NF No partial dependency (depends on whole primary key)

3NF No transitive dependency (depends only on primary key)

BCNF Every determinant is a candidate key

4NF No multi-valued dependency

5NF No join dependency

6NF Used in temporal databases

4(a).Discuss 1NF and 2NF with examples

First Normal Form (1NF)

A table is in 1NF if:

1. Atomicity: All attributes (columns) contain atomic (indivisible) values.


2. Uniqueness: Each column contains unique values for a particular row.
3. Uniqueness of Rows: Each row is uniquely identified.

Example (Non-1NF Table)

Student_ID Name Subjects

101 Alice Math, Science

102 Bob English

103 John Math, English

● The Subjects column has multiple values (Math, Science, etc.), which violates 1NF.

Converting to 1NF

To achieve 1NF, break the multi-valued attributes into separate rows:


Student_ID Name Subject

101 Alice Math

101 Alice Science

102 Bob English

103 John Math

103 John English

Now, each column contains atomic values, and there are no repeating groups.

Second Normal Form (2NF)

A table is in 2NF if:

1. It is in 1NF.
2. No Partial Dependency: A non-prime attribute (an attribute that is not part of the primary key) must depend
on the whole primary key, not just a part of it.

Example (1NF but Not 2NF)

Student_ID Subject Teacher

101 Math Mr. A

101 Science Mr. B

102 English Mr. C

103 Math Mr. A

103 English Mr. C

● The composite primary key here is (Student_ID, Subject).


● The column Teacher depends only on Subject, not on Student_ID, causing a partial dependency.

Converting to 2NF
To remove partial dependency, split the table into two:

Student_Subject Table (2NF)

Student_ID Subject

101 Math

101 Science

102 English

103 Math

103 English

Subject_Teacher Table (2NF)

Subject Teacher

Math Mr. A

Science Mr. B

English Mr. C

Now, every non-key column depends fully on the primary key, ensuring 2NF.

4(b).How does BCNF differ from 3NF? Explain with an example.

Difference Between BCNF and 3NF

Third Normal Form (3NF) and Boyce-Codd Normal Form (BCNF) both eliminate transitive dependencies, but
BCNF is stricter than 3NF.

3NF (Third Normal Form)

A table is in 3NF if:

1. It is in 2NF.
2. No Transitive Dependency: Every non-prime attribute (a column not part of the primary key) must depend
only on the primary key.

Example (Table in 2NF but Not 3NF)

Student_ID Student_Name Course_ID Course_Name

101 Alice CSE101 DBMS

102 Bob CSE102 OS

103 John CSE101 DBMS


● Primary Key: (Student_ID, Course_ID)
● Problem: Course_Name depends on Course_ID, not on (Student_ID, Course_ID).
● Solution: Separate Course_Name into another table.

3NF Tables:

1. Student_Course Table

Student_ID Course_ID

101 CSE101

102 CSE102

103 CSE101

2. Course Table

Course_ID Course_Name

CSE101 DBMS

CSE102 OS

Now, every non-key attribute depends only on the primary key.

BCNF (Boyce-Codd Normal Form)

A table is in BCNF if:

1. It is in 3NF.
2. For every functional dependency (X → Y), X should be a superkey.

Key Difference: In 3NF, a table can have a non-trivial functional dependency where a non-superkey determines
another non-key attribute. BCNF removes even this possibility.

Example (3NF but Not BCNF)

Professor Department Course

Prof. A CS DBMS

Prof. B CS OS
Prof. C EE Circuits

● Functional Dependency: Professor → Department


● Candidate Key: (Professor, Course)
● Problem: Department depends on Professor, which is not a superkey.

Converting to BCNF

Prof_Dept Table

Professor Department

Prof. A CS

Prof. B CS

Prof. C EE

Dept_Course Table

Department Course

CS DBMS

CS OS

EE Circuits

Now, every functional dependency holds on a superkey, ensuring BCNF.

Feature 3NF BCNF

Removes Partial Dependency? ✅ Yes ✅ Yes

Removes Transitive Dependency? ✅ Yes ✅ Yes


Ensures Every Determinant is a Superkey? ❌ Not Always ✅ Yes

More Strict? No Yes

5(a).Explain 3NF

Third Normal Form (3NF) is a database normalization form that aims to reduce redundancy and dependency by
ensuring that every non-key attribute is only dependent on the primary key.

A table is in Third Normal Form (3NF) if:

1. It is in Second Normal Form (2NF).


2. It does not have transitive dependencies, meaning non-key attributes must depend only on the
primary key and not on other non-key attributes.

Transitive Dependency:

A transitive dependency occurs when a non-key attribute depends on another non-key attribute instead of
depending directly on the primary key.

Example of a Table Not in 3NF:

Consider the following Student table:

Student_ID Name Department HOD_Name

101 Alice CSE Dr. Smith

102 Bob ECE Dr. Brown

103 Charlie CSE Dr. Smith

Here, the HOD_Name depends on the Department, not directly on Student_ID. This is a transitive dependency.

Converting to 3NF

To remove the transitive dependency, we create a separate Department table:

Student Table

Student_ID Name Department

101 Alice CSE

102 Bob ECE


103 Charlie CSE

Department Table

Department HOD_Name

CSE Dr. Smith

ECE Dr. Brown

Now, HOD_Name depends only on Department, and all attributes in the Student table depend only on Student_ID.
This ensures 3NF.

Benefits of 3NF

● Eliminates transitive dependencies.


● Reduces data redundancy.
● Improves database consistency and integrity.

5(b).Consider the relation SUPPLIER (SNAME, STREET, CITY, STATE, TAX) with key on
SNAME and FD: STATE →TAX. Decompose the relation SUPPLIER into 3NF Relations.

Decomposition of SUPPLIER into 3NF Relations

Step 1: Given Relation

We have the relation:

SUPPLIER (SNAME, STREET, CITY, STATE, TAX)

● Primary Key: SNAME


● Functional Dependency (FD): STATE → TAX

Step 2: Check for 3NF Violation

A relation is in 3NF if:

1. It is in 2NF (no partial dependency).


2. It does not have transitive dependency, meaning every non-key attribute should depend only on the
primary key.

Here, we observe:

● SNAME → STREET, CITY, STATE, TAX (Since SNAME is the primary key)
● STATE → TAX (This creates a transitive dependency because TAX depends on STATE, not directly on
SNAME.)

Since TAX is indirectly dependent on SNAME through STATE, this violates 3NF.

Step 3: Decomposition into 3NF


To remove the transitive dependency, we decompose the relation into two separate relations.

1. SUPPLIER Table (Holds supplier details)

SUPPLIER (SNAME, STREET, CITY, STATE)

● Primary Key: SNAME


● Dependency: SNAME → STREET, CITY, STATE (No transitive dependency)

2. STATE_TAX Table (Stores tax information for each state) (STATE, TAX)

● Primary Key: STATE


● Dependency: STATE → TAX (No transitive dependency)

Step 4: Verify 3NF

● In SUPPLIER (SNAME, STREET, CITY, STATE), all non-key attributes (STREET, CITY, STATE) depend
only on SNAME, making it 3NF compliant.
● In STATE_TAX (STATE, TAX), TAX depends only on STATE, and STATE is the primary key, so it is also
in 3NF.

Final Decomposed Relations

1. SUPPLIER (SNAME, STREET, CITY, STATE)


○ Primary Key: SNAME
○ Functional Dependency: SNAME → STREET, CITY, STATE
2. STATE_TAX (STATE, TAX)
○ Primary Key: STATE
○ Functional Dependency: STATE → TAX

This decomposition ensures that the database is in Third Normal Form (3NF).
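
As a minimal SQL sketch of the result (data types assumed for illustration):

CREATE TABLE STATE_TAX (
    STATE VARCHAR(30) PRIMARY KEY,
    TAX DECIMAL(5,2) -- STATE → TAX now lives only here
);

CREATE TABLE SUPPLIER (
    SNAME VARCHAR(50) PRIMARY KEY,
    STREET VARCHAR(100),
    CITY VARCHAR(50),
    STATE VARCHAR(30),
    FOREIGN KEY (STATE) REFERENCES STATE_TAX(STATE)
);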

6.Explain about lossless-join and dependency preserving decompositions.

Lossless-Join and Dependency-Preserving Decompositions

When decomposing a relation into smaller relations, we aim to maintain two important properties:

1. Lossless-Join Decomposition (Ensures no loss of information when relations are joined)


2. Dependency-Preserving Decomposition (Ensures that all functional dependencies are preserved in at
least one decomposed relation)

1. Lossless-Join Decomposition

A decomposition is lossless if we can reconstruct the original relation by joining the decomposed relations
without any loss of data.

Definition:

A decomposition of a relation R into R1 and R2 is lossless if:

R1⋈R2=R

This ensures that no extra tuples are introduced, and no information is lost.

Condition for Lossless-Join

A decomposition R1 and R2 is lossless if:

R1∩R2→R1 or R1∩R2→R2
This means that the common attributes between R1 and R2 must act as a superkey in at least one of the
decomposed relations.

Example:

Consider the relation: R(A, B, C) with the functional dependency: A → B

If we decompose into:

1. R1(A,B)
2. R2(A,C)

The common attribute is A, and since A→B, A acts as a superkey for R1.
Thus, the decomposition is lossless.
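
As a SQL sketch (using the relation names from this example), the join on the common attribute A reconstructs R exactly, because A is a key of R1:

SELECT r1.A, r1.B, r2.C
FROM R1 r1 JOIN R2 r2 ON r1.A = r2.A; -- yields exactly the tuples of R, no spurious rows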

2. Dependency-Preserving Decomposition

A decomposition is dependency-preserving if all functional dependencies (FDs) in the original relation are
maintained in at least one decomposed relation.

Definition:

If a relation R with a set of functional dependencies (FDs) F is decomposed into R1, R2, …, Rn, then the
decomposition is dependency-preserving if:

(F1∪F2∪⋯∪Fn)+=F+

where F+ is the closure of functional dependencies.

Example:

Consider the relation: R(A,B,C) with the functional dependencies: A→B,B→C

Decomposing into:

1. R1(A,B)
2. R2(B,C)

Here:

● A→B is preserved in R1
● B→C is preserved in R2

Since all original dependencies are preserved, this is a dependency-preserving decomposition.

Comparison of Both Properties

Property Purpose

Lossless-Join Ensures no data is lost after decomposition and joining.

Dependency-Preserving Ensures all functional dependencies remain intact.

Ideal Decomposition:

A good decomposition should be:


● Lossless (to prevent information loss)
● Dependency-Preserving (to avoid extra joins when checking FDs)

However, sometimes we must choose between them, especially in higher normal forms.

7(a).Describe the multivalued dependencies

In database theory, a multivalued dependency (MVD) is a constraint that specifies that the presence of certain
tuples in a relation implies the presence of other tuples. Here's a breakdown:

Core Concepts:

● What it is:
○ An MVD exists when having a value for one attribute determines a set of values for another attribute,
and this set of values is independent of the values of other attributes in the relation.

○ It's a constraint between sets of attributes in a relation.
● Key Distinction from Functional Dependency (FD):
○ While an FD states that one attribute (or set of attributes) determines a single value of another
attribute, an MVD deals with situations where one attribute determines multiple independent values of
another attribute.
● Role in Normalization:
○ MVDs are crucial in database normalization, specifically in the context of Fourth Normal Form (4NF).
4NF aims to eliminate redundancy caused by MVDs.

Explanation and Example:

Imagine a database for a course. A course can have multiple assigned textbooks and multiple assigned instructors.
These two sets of information (textbooks and instructors) are independent of each other.

● If "Course" is one attribute, "Textbook" is another, and "Instructor" is a third, then:


○ A "Course" determines a set of "Textbooks."
○ A "Course" also determines a set of "Instructors."
○ However, the "Textbooks" and "Instructors" are independent of each other.

This scenario illustrates a multivalued dependency.

Key Points:

● MVDs help to identify and manage complex relationships within a database.



● They play a significant role in ensuring data consistency and minimizing redundancy.

● They are important for understanding and implementing 4th normal form within database design.

7(b).Discuss about 4NF with example

Fourth Normal Form (4NF) is a level of database normalization that builds upon Boyce-Codd Normal Form (BCNF).
Its primary goal is to eliminate redundancies caused by multivalued dependencies. Here's a breakdown:

Key Concepts:

● Foundation:
○ A relation (table) is in 4NF if it is already in BCNF.

○ It must have no non-trivial multivalued dependency X →→ Y unless X is a superkey.
● Multivalued Dependencies (MVDs):
○ These occur when an attribute determines multiple independent values for another attribute.
○ 4NF focuses on removing these dependencies to reduce redundancy.

● Purpose:
○ To further refine database design and prevent anomalies that can arise from MVDs.
Example:

Let's consider a scenario involving students, their hobbies, and the courses they are enrolled in.

● Initial Table (Not in 4NF):

| StudentID | Hobby | Course |
| S1 | Painting | Math |
| S1 | Painting | Physics |
| S1 | Hiking | Math |
| S1 | Hiking | Physics |
| S2 | Reading | Chemistry |
| S2 | Reading | Biology |
● Problem:

○ A student can have multiple hobbies, and a student can enroll in multiple courses.
○ The hobbies and courses are independent of each other.
○ This leads to redundant data. For example, the student "S1" and their hobbies and courses are
repeated.
○ This table has multivalued dependencies: StudentID →→ Hobby and StudentID →→ Course.
● Solution (4NF):

○ To achieve 4NF, we decompose the table into two separate tables:


■ StudentHobbies Table:

| StudentID | Hobby |
| S1 | Painting |
| S1 | Hiking |
| S2 | Reading |

■ StudentCourses Table:

| StudentID | Course |
| S1 | Math |
| S1 | Physics |
| S2 | Chemistry |
| S2 | Biology |

● Result:

○ Each table now contains only a trivial multivalued dependency implied by its key
(StudentID →→ Hobby in one table, StudentID →→ Course in the other).
○ Redundancy is eliminated, and data integrity is improved.

In essence:

4NF ensures that independent multivalued facts are stored in separate tables, preventing unnecessary repetition of
data
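
As a quick illustrative check (plain Python, assuming the toy rows above), natural-joining the two projections on StudentID reproduces exactly the original rows, confirming the decomposition is lossless while storing fewer tuples:

original = {
    ("S1", "Painting", "Math"), ("S1", "Painting", "Physics"),
    ("S1", "Hiking", "Math"),   ("S1", "Hiking", "Physics"),
    ("S2", "Reading", "Chemistry"), ("S2", "Reading", "Biology"),
}

student_hobbies = {(s, h) for s, h, _ in original}   # 3 rows instead of 6
student_courses = {(s, c) for s, _, c in original}   # 4 rows instead of 6

# Natural join on StudentID
rejoined = {(s, h, c)
            for s, h in student_hobbies
            for s2, c in student_courses if s == s2}

print(rejoined == original)                        # True: no loss, no spurious rows
print(len(student_hobbies), len(student_courses))  # 3 4: redundancy removed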

8(a).What is the use of surrogate key?

Surrogate keys play a crucial role in database design, particularly in data warehousing, by providing a stable and
efficient way to identify records. Here's a breakdown of their uses:

Key Advantages and Uses:

● Stability:
○ Natural keys (keys derived from real-world data) can change. For example, a customer's name or
address might be used as a natural key, but these can be updated. Surrogate keys, once assigned,
remain constant, ensuring data integrity.

● Simplicity and Performance:
○ Surrogate keys are typically simple data types, such as integers. This makes them efficient for
indexing, joining tables, and performing queries, leading to improved database performance.

● Handling Complex Natural Keys:
○ Natural keys can be complex, involving multiple columns or lengthy strings. Surrogate keys provide a
single, concise identifier, simplifying database operations.

● Data Integration:
○ When integrating data from multiple sources, natural keys may conflict or have inconsistencies.
Surrogate keys provide a unified and consistent way to identify records across different systems.
● Historical Data Tracking:
○ In data warehousing, it's essential to track changes over time. Surrogate keys allow you to maintain
historical records even when natural key values change.

● Data Anonymization:
○ Surrogate keys can replace sensitive natural keys, such as social security numbers, to protect privacy
while still maintaining the ability to uniquely identify records.
● Decoupling from Business Logic:
○ By using surrogate keys, the database structure is decoupled from the business logic. This means
that changes to the business rules will have less of an impact on the database structure.

In essence, surrogate keys provide a reliable and efficient way to manage data relationships, especially in complex
database environments.
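
A minimal sketch of how a surrogate key might be assigned during data integration (the source names and natural-key formats below are hypothetical): each distinct natural key receives a stable integer id, and the id never changes even if the natural key later would.

from itertools import count

_next_id = count(start=1)
_surrogate = {}            # natural key -> surrogate key

def surrogate_key(natural_key):
    """Return a stable integer id for a natural key, assigning one if new."""
    if natural_key not in _surrogate:
        _surrogate[natural_key] = next(_next_id)
    return _surrogate[natural_key]

print(surrogate_key(("CRM", "cust-42")))     # 1
print(surrogate_key(("Billing", "C0042")))   # 2 (different natural key)
print(surrogate_key(("CRM", "cust-42")))     # 1 again: stable over time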

8(b).Explain about Join Dependency and 5NF

To understand 5NF, we must first grasp the concept of join dependencies. Here's a breakdown:

Join Dependency:

● Definition:
○ A join dependency (JD) exists when a relation can be reconstructed by joining certain of its
projections. In simpler terms, it means that a table can be losslessly decomposed into multiple
smaller tables, and those smaller tables can be joined back together to recreate the original table.

○ It's a generalization of multivalued dependencies.

● Purpose:
○ Join dependencies highlight complex relationships where data is dependent on combinations of
attributes, rather than just single attributes.

Fifth Normal Form (5NF):

● Definition:
○ A relation is in 5NF if it is in 4NF and every join dependency in it is implied by the candidate keys.

○ Essentially, 5NF aims to eliminate redundancy that cannot be removed by 4NF, focusing on very
complex join dependencies.

○ It is also known as Project-join normal form (PJ/NF).

● Purpose:
○ 5NF addresses situations where breaking down a table into smaller tables is necessary to avoid
redundancy, but those smaller tables must be able to be rejoined without losing any information. It is
the final normal form that is often discussed.
● Relationship with Join Dependency:
○ 5NF is directly related to join dependencies. A relation that has a non-trivial join dependency that is
not implied by candidate keys is not in 5NF.

○ So in essence, 5NF ensures that all join dependencies are a result of the candidate keys.

In simpler terms:

● Imagine a scenario where a complex relationship involves multiple attributes that must be considered
together. 5NF ensures that this relationship is broken down into its most basic components, preventing any
redundancy that could arise from storing those attributes together.

Key takeaway:

● 5NF is a high level of normalization and is less commonly used than lower normal forms.
It is most relevant in situations with very complex data relationships.
UNIT 5
1(a).Describe the transaction states

In database management systems (DBMS), a transaction represents a single logical unit of work. To ensure data
integrity and consistency, transactions progress through various states during their lifecycle. Here's a breakdown of
the common transaction states:

Key Transaction States:

● Active:
○ This is the initial state. A transaction enters the active state when it begins execution.
○ During this state, the transaction performs read and write operations on the database.
○ Changes made in this state are typically stored in temporary memory or buffers.
● Partially Committed:
○ A transaction enters this state after it has executed its final operation.
○ At this point, the transaction has completed its logical processing, but the changes have not yet been
permanently written to the database.
○ There's still a possibility that the transaction might fail before the changes are made permanent.
● Committed:
○ A transaction reaches the committed state when all its operations have been successfully completed,
and its changes have been permanently recorded in the database.
○ Once a transaction is committed, its effects are durable, meaning they will survive system failures.
● Failed:
○ A transaction enters the failed state if any error or failure occurs during its execution.
○ This could be due to hardware failures, software errors, or violation of database constraints.
○ When a transaction fails, it cannot continue its normal execution.
● Aborted:
○ If a transaction enters the failed state, the DBMS initiates the abortion process.
○ During abortion, the DBMS rolls back the transaction, undoing any changes it made to the database.
○ This ensures that the database returns to a consistent state.
● Terminated:
○ This is the final state. A transaction enters the terminated state after it has either been committed or
aborted.
○ At this point, the transaction has completed its lifecycle, and the system is ready to process new
transactions.

In essence:

These transaction states are crucial for maintaining the ACID properties (Atomicity, Consistency, Isolation,
Durability) of database transactions, which are essential for reliable data management.
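
The lifecycle above can be summarized as a small state machine; the following Python sketch (illustrative, not from any DBMS) encodes the legal transitions between the six states:

from enum import Enum, auto

class TxState(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    COMMITTED = auto()
    FAILED = auto()
    ABORTED = auto()
    TERMINATED = auto()

# state -> states reachable from it
TRANSITIONS = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.COMMITTED: {TxState.TERMINATED},
    TxState.ABORTED: {TxState.TERMINATED},
    TxState.TERMINATED: set(),
}

def advance(current, nxt):
    """Move to `nxt` if the lifecycle allows it, else raise."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {nxt}")
    return nxt

s = TxState.ACTIVE
s = advance(s, TxState.PARTIALLY_COMMITTED)
s = advance(s, TxState.COMMITTED)
s = advance(s, TxState.TERMINATED)
print(s)   # TxState.TERMINATED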

1(b).Describe the properties of transaction.

When discussing database transactions, the acronym ACID is fundamental. It represents the four key properties that
guarantee reliable transaction processing. Here's a breakdown:

ACID Properties:

● Atomicity:
○ This property ensures that a transaction is treated as a single, indivisible unit of work.
○ Either all operations within the transaction are completed successfully, or none of them are.
○ If any part of the transaction fails, the entire transaction is rolled back, and the database returns to its
previous consistent state.
○ Essentially, it's the "all or nothing" principle.
● Consistency:
○ This property guarantees that a transaction moves the database from one valid consistent state to
another.
○ It ensures that the database adheres to all defined rules, constraints, and integrity conditions.
○ The transaction must preserve the database's integrity.
● Isolation:
○ This property ensures that concurrent transactions do not interfere with each other.
○ Each transaction appears to execute independently, as if it were the only transaction running.
○ This prevents data corruption and ensures that transactions do not see intermediate, uncommitted
changes made by other transactions.
● Durability:
○ This property guarantees that once a transaction is committed, its changes are permanent and will
survive even system failures, such as power outages or crashes.
○ Committed changes are written to persistent storage, ensuring that they are not lost.

Why These Properties Matter:

● The ACID properties are crucial for maintaining data integrity and reliability in database systems.
● They ensure that transactions are processed correctly, even in complex and concurrent environments.
● They provide a foundation for building robust and dependable database applications.

In summary, the ACID properties are essential for ensuring that database transactions are processed reliably and
accurately, safeguarding the integrity of the data.

2(a).Examine the need of concurrent executions.

Need for Concurrent Executions

In modern computing, concurrent execution refers to the ability of a system to execute multiple tasks
simultaneously. This is essential for improving system performance, resource utilization, and responsiveness. The
need for concurrent execution arises in various scenarios, including multi-user environments, parallel computing,
and real-time applications.

1. Efficient CPU Utilization

Without concurrency, a CPU may remain idle while waiting for input/output (I/O) operations to complete. By allowing
multiple processes or threads to execute simultaneously, the system ensures that the CPU is used efficiently,
minimizing idle time.

2. Improved System Throughput

Concurrency increases the number of tasks a system can process within a given time. Instead of executing tasks
sequentially, concurrent execution enables multiple tasks to run in parallel, leading to better system throughput.

3. Responsiveness in Interactive Systems

In applications like web browsers, operating systems, and real-time systems, concurrency ensures that multiple user
requests can be processed simultaneously. This enhances responsiveness, as users do not have to wait for one
task to finish before another begins.

4. Support for Multi-User Environments

In systems where multiple users interact simultaneously (e.g., databases, web servers), concurrent execution allows
multiple queries or transactions to be processed without significant delays, preventing bottlenecks.

5. Parallel Processing for Performance Boost

Modern processors have multiple cores that can handle multiple threads or processes simultaneously. Concurrency
enables efficient parallel execution, improving the performance of applications such as scientific computing, artificial
intelligence, and simulations.

6. Resource Sharing and Synchronization

Concurrency enables multiple processes to share resources like memory, files, and networks efficiently. However,
synchronization mechanisms (e.g., locks, semaphores) are required to prevent race conditions and ensure data
consistency.

7. Real-Time Application Needs

In real-time systems (e.g., autonomous vehicles, industrial automation, medical systems), concurrency ensures that
time-sensitive tasks are executed without delays, meeting strict deadlines for data processing.
2(b).Analyse the anomalies associated with interleaved execution.

Anomalies Associated with Interleaved Execution

Interleaved execution occurs when multiple processes or threads execute concurrently, with their instructions
interleaved over time. While this improves system efficiency, it can also lead to various anomalies that affect data
consistency, correctness, and program behavior. These anomalies are particularly significant in database systems,
operating systems, and multi-threaded applications.

1. Lost Update Anomaly

Occurs when two transactions or processes update the same data simultaneously, and one update is lost due to
interleaved execution.

Example:

● T1 reads a value (X = 10).


● T2 reads the same value (X = 10).
● T1 updates X to 15.
● T2 updates X to 20.

Since T2 was unaware of T1's update, the final value is X = 20, and T1's update to 15 is lost.
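
The lost update can be reproduced in a few lines of Python: two threads perform an unsynchronized read-modify-write on a shared counter, so some increments overwrite each other exactly as in the T1/T2 example above (a runnable sketch; the final total usually falls short of 200000).

import sys
import threading

sys.setswitchinterval(1e-6)   # switch threads very often to expose the race

balance = 0

def deposit(times):
    global balance
    for _ in range(times):
        current = balance        # read
        balance = current + 1    # write based on a possibly stale read

t1 = threading.Thread(target=deposit, args=(100_000,))
t2 = threading.Thread(target=deposit, args=(100_000,))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)   # expected 200000; typically less because updates were lost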

2. Dirty Read (Uncommitted Dependency)

Occurs when a transaction reads data that another transaction has modified but not yet committed. If the modifying
transaction is rolled back, the read transaction will have incorrect data.

Example:

● T1 updates X from 10 to 15 but has not committed.


● T2 reads X as 15 and uses it for further calculations.
● T1 rolls back, restoring X to 10.

Now, T2 has used incorrect data (15), leading to inconsistency.

3. Inconsistent Read (Non-Repeatable Read)

Happens when a transaction reads the same data multiple times, but another transaction modifies it between reads,
leading to inconsistent results.

Example:

● T1 reads X as 10.
● T2 updates X to 15 and commits.
● T1 reads X again and gets 15.

T1 expected X to be consistent throughout, but it changed due to interleaving.

4. Phantom Read

Occurs when a transaction retrieves a set of records based on a condition, but another transaction inserts, updates,
or deletes records, changing the result set.

Example:

● T1 runs a query to count records where salary > 50000.


● T2 inserts a new record with salary = 60000 and commits.
● T1 runs the same query again but gets a different count.

The number of records changed unexpectedly due to interleaved execution.

5. Deadlocks
Deadlocks occur when two or more transactions wait for each other to release resources, causing a permanent
block in execution.

Example:

● T1 locks resource A and requests resource B.


● T2 locks resource B and requests resource A.

Since neither can proceed without the other releasing the lock, a deadlock occurs.

6. Priority Inversion

Happens when a high-priority task is waiting for a low-priority task to release a resource, but the low-priority task
cannot complete due to system constraints, leading to delays.

Interleaved execution improves system utilization and responsiveness but introduces anomalies that can lead to
data inconsistencies, deadlocks, and unpredictable program behavior. To mitigate these issues, synchronization
techniques such as locks, transactions, isolation levels, and concurrency control mechanisms must be used.

3.Explain the following i) Serializability ii) Testing for Serializability iii) Recoverability

i) Serializability

Serializability is a key concept in database concurrency control that ensures the correctness of transactions
executed concurrently. A schedule (a sequence of interleaved operations from different transactions) is serializable
if it results in the same final database state as some serial execution (where transactions execute one after another
without interleaving).

Types of Serializability:

1. Conflict Serializability

○A schedule is conflict serializable if it can be transformed into a serial schedule by swapping non-
conflicting operations.
○ Two operations conflict if:
■ They belong to different transactions.
■ They access the same data.
■ At least one of them is a write operation.
2. View Serializability

○ A schedule is view serializable if it produces the same final result as a serial schedule, even if
conflicts exist.
○ This is less restrictive than conflict serializability but harder to test.

ii) Testing for Serializability

To determine if a schedule is conflict serializable, we use the precedence graph (also called a serialization graph) method:

Steps to Test for Serializability Using Precedence Graph:

1. Create a directed graph where:

○Each node represents a transaction.


○A directed edge (T1 → T2) exists if T1 executes an operation (e.g., write) before T2 that
conflicts on the same data item.
2. Check for cycles in the graph:

○ If the graph has no cycles, the schedule is conflict serializable.


○ If a cycle exists, the schedule is not serializable.
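
A sketch of this test in Python (the schedule representation is my own): build the precedence graph from conflicting operation pairs, then detect a cycle with depth-first search.

def precedence_edges(schedule):
    """Edges Ti -> Tj for each pair of conflicting ops, earlier op first."""
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            if ti != tj and x_i == x_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def dfs(u):
        color[u] = GREY
        for v in graph[u]:
            if color[v] == GREY or (color[v] == WHITE and dfs(v)):
                return True
        color[u] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)

# T1 reads X before T2 writes it, and T2 writes X before T1 writes it back:
s = [("T1", "R", "X"), ("T2", "W", "X"), ("T1", "W", "X")]
edges = precedence_edges(s)
print(edges)              # {('T1', 'T2'), ('T2', 'T1')}
print(has_cycle(edges))   # True -> not conflict serializable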

iii) Recoverability
Recoverability ensures that a schedule maintains database consistency by allowing transactions to undo changes
safely in case of failure. A schedule is recoverable if a transaction commits only after all transactions from which it
has read data have also committed.

Types of Recoverable Schedules:

1. Recoverable Schedule (RC)

○ If a transaction Tj reads data written by Ti, then Tj must commit only after Ti has committed.
2. Cascadeless Schedule (ACA - Avoids Cascading Aborts)

○ Prevents cascading rollbacks, where aborting one transaction forces others to abort.
○ No transaction should read uncommitted data from another transaction.
3. Strict Schedule (ST)

○Ensures strict two-phase locking (Strict 2PL), where no transaction reads or writes a data item
until the transaction that last modified it has committed.
4. Rigorous Schedule

○ The most restrictive, where locks are held until a transaction commits or aborts, preventing dirty
reads and cascading rollbacks.
● Serializability ensures that concurrent execution produces the same results as some serial order.
● Testing for serializability involves checking for cycles in the precedence graph.
● Recoverability ensures that transactions commit safely without leading to inconsistencies or cascading
failures.

4(a).Explain about two phase locking protocol.

Two-Phase Locking (2PL) Protocol

The Two-Phase Locking (2PL) Protocol is a concurrency control mechanism used in databases to ensure
serializability by managing how transactions acquire and release locks. It prevents anomalies like dirty reads, lost
updates, and non-repeatable reads.

Phases of Two-Phase Locking (2PL)

The protocol consists of two distinct phases for each transaction:

1. Growing Phase:

○ A transaction acquires locks but does not release any locks during this phase.
○ It can obtain shared (read) locks and exclusive (write) locks as needed.
○ This phase continues until the transaction reaches its lock point, where it acquires its last lock.
2. Shrinking Phase:

○ Once a transaction releases a lock, it cannot acquire new locks anymore.


○ It starts releasing locks gradually until the transaction either commits or aborts.

Example of Two-Phase Locking

Consider two transactions T1 and T2 operating on data items X and Y:

| Transaction T1 | Transaction T2 |
| Lock-X(X) | |
| Read(X) | |
| Lock-X(Y) | |
| Read(Y) | Lock-X(Z) |
| Write(Y) | Read(Z) |
| Unlock(Y) | Lock-X(X) (waits) |
| Unlock(X) | (now T2 can proceed) |

● Growing phase: T1 acquires locks on X and Y.


● Shrinking phase: T1 releases locks in order. T2 must wait until T1 releases X.

This ensures serializability, as T2 cannot proceed until T1 finishes, preventing conflicts.
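
The two-phase discipline itself is easy to express in code. This illustrative Python sketch enforces the defining rule: no lock may be acquired once the first lock has been released (class and method names are my own).

class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False    # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: 2PL violation - "
                               "cannot lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True     # lock point has been passed
        self.held.discard(item)

t1 = TwoPhaseTxn("T1")
t1.lock("X"); t1.lock("Y")   # growing phase
t1.unlock("Y")               # shrinking phase begins
try:
    t1.lock("Z")             # violates 2PL
except RuntimeError as e:
    print(e)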

Types of Two-Phase Locking (2PL)

1. Strict Two-Phase Locking (Strict 2PL)

● All locks (both read and write) are held until the transaction commits or aborts.
● Prevents dirty reads and cascading rollbacks.
● Used in most database systems.

2. Rigorous Two-Phase Locking

● Even stricter than Strict 2PL—all locks are held until the transaction commits, ensuring strict
serializability.
● Provides better recoverability but increases waiting time.

3. Conservative Two-Phase Locking (Static 2PL)

● All required locks are acquired before the transaction starts execution (pre-locking).
● Avoids deadlocks but may cause delays due to waiting for lock availability.

Advantages of Two-Phase Locking

✔ Ensures serializability, preventing anomalies.


✔ Provides better consistency and concurrency control.
✔ Strict 2PL and Rigorous 2PL prevent dirty reads and cascading rollbacks.

Disadvantages of Two-Phase Locking

✘ Deadlocks can occur if multiple transactions wait for locks.


✘ Blocking delays since transactions must wait for lock release.
✘ Reduced concurrency, as transactions may need to wait longer before proceeding.

The Two-Phase Locking (2PL) Protocol ensures serializability by dividing a transaction into growing and
shrinking phases. Variants like Strict 2PL and Rigorous 2PL enhance safety but may reduce concurrency.
Deadlock detection and prevention mechanisms are often used alongside 2PL to manage its limitations effectively.

4(b).Determine when two operations in a schedule are said to be conflict?

Conflict in a Schedule
Two operations in a schedule are said to be in conflict if they meet the following three conditions simultaneously:

1. They belong to different transactions

○ The operations must be performed by different transactions (T1, T2, etc.).


2. They operate on the same data item

○ Both operations must access the same database item (e.g., X, Y, etc.).
3. At least one of the operations is a write (update)

○ If at least one operation is a write, a conflict occurs because the value of the data item may change.

Types of Conflicts

1. Read-Write Conflict (Uncommitted Dependency / Dirty Read)

○ A transaction reads a value that another transaction is updating but has not committed.
○ Example:
■ T1: Read(X)
■ T2: Write(X) (Conflict occurs)
2. Write-Read Conflict (Inconsistent Read / Non-Repeatable Read)

○ A transaction reads a value after another transaction has modified it.


○ Example:
■ T1: Write(X)
■ T2: Read(X) (Conflict occurs)
3. Write-Write Conflict (Lost Update)

○ Two transactions write to the same data item, causing one update to be lost.
○ Example:
■ T1: Write(X)
■ T2: Write(X) (Conflict occurs)

Example of Conflicting Operations in a Schedule

| Time | T1 | T2 |
| 1 | Read(X) | |
| 2 | | Write(X) (Conflict: Read-Write) |
| 3 | Write(X) | |
| 4 | | Read(X) (Conflict: Write-Read) |
| 5 | Write(X) | Write(X) (Conflict: Write-Write) |

Two operations are in conflict if they are from different transactions, access the same data item, and at least one of
them is a write operation. Conflicts lead to concurrency anomalies, which are managed using locking,
timestamp ordering, and concurrency control techniques.
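
The three conditions translate directly into a predicate; a small Python sketch with operations represented as (transaction, action, item) triples:

def conflict(op1, op2):
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return (t1 != t2                  # 1. different transactions
            and x1 == x2              # 2. same data item
            and "W" in (a1, a2))      # 3. at least one write

print(conflict(("T1", "R", "X"), ("T2", "W", "X")))   # True  (read-write)
print(conflict(("T1", "R", "X"), ("T2", "R", "X")))   # False (two reads)
print(conflict(("T1", "W", "X"), ("T1", "W", "X")))   # False (same transaction)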
5.Explain about the lock management in detail.

1. Introduction to Lock Management

Lock management is a concurrency control mechanism in databases and operating systems that ensures data
consistency and prevents conflicts in multi-user environments. A lock manager is responsible for granting and
releasing locks, ensuring that transactions follow correct synchronization protocols.

2. Types of Locks in Lock Management

A. Based on Data Access Type

1. Shared Lock (S-Lock or Read Lock)

○ Allows multiple transactions to read a data item.


○ No transaction can write to the data while an S-lock is held.
○ Example:
■ T1: Lock-S(X) → T2 can also acquire Lock-S(X) but cannot Lock-X(X)
2. Exclusive Lock (X-Lock or Write Lock)

○ Only one transaction can write to a data item at a time.


○ No other transaction can read or write while the X-lock is held.
○ Example:
■ T1: Lock-X(X) → T2 must wait until T1 releases the lock.

B. Based on Lock Duration

1. Short-Term (Transaction-Level) Locks

○ Released immediately after the operation (read/write) completes.


○ Used in optimistic concurrency control (OCC).
2. Long-Term (Transaction-Scoped) Locks

○ Held until the transaction commits or aborts.


○ Used in strict concurrency control mechanisms like Two-Phase Locking (2PL).

C. Based on Lock Granularity

1. Row-Level Locking

○ Locks only a single row in a table.


○ Provides high concurrency but increases locking overhead.
2. Table-Level Locking

○ Locks an entire table.


○ Ensures consistency but reduces concurrency.
3. Database-Level Locking

○ Locks the entire database (used for backup operations).


4. Page-Level Locking

○ Locks a group of rows (pages).


○ Balances concurrency and performance.

3. Locking Protocols

A. Two-Phase Locking (2PL)

● Transactions go through two phases:


1. Growing Phase: Acquire locks but cannot release them.
2. Shrinking Phase: Release locks but cannot acquire new ones.
● Ensures serializability but may lead to deadlocks.
B. Strict Two-Phase Locking (Strict 2PL)

● All locks are held until commit or rollback, preventing dirty reads.
● Prevents cascading rollbacks.

C. Rigorous Two-Phase Locking (Rigorous 2PL)

● Even stricter than Strict 2PL—all locks are held until the transaction commits.

D. Conservative Two-Phase Locking (Static 2PL)

● Pre-locking all required resources before execution to avoid deadlocks.

4. Lock Compatibility Matrix

| Operation | Shared Lock (S) | Exclusive Lock (X) |
| Read (S-Lock) | ✅ Allowed | ❌ Not Allowed |
| Write (X-Lock) | ❌ Not Allowed | ❌ Not Allowed |

● Shared locks allow multiple readers.


● Exclusive locks prevent all access until the lock is released.
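
The matrix can be encoded as a lookup table; a sketch of the check a lock manager performs before granting a request (illustrative Python, not any specific DBMS's implementation):

# May a requested lock be granted, given a lock already held by another txn?
COMPATIBLE = {
    ("S", "S"): True,     # many readers may share
    ("S", "X"): False,    # a writer excludes readers
    ("X", "S"): False,
    ("X", "X"): False,    # only one writer at a time
}

def can_grant(requested, held):
    return COMPATIBLE[(requested, held)]

print(can_grant("S", "S"))   # True
print(can_grant("X", "S"))   # False - the request must wait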

5. Deadlocks in Lock Management

Deadlock Occurrence

● A deadlock occurs when two or more transactions wait indefinitely for each other to release locks.
● Example:
○ T1 locks A and requests B (held by T2).
○ T2 locks B and requests A (held by T1).
○ Both transactions wait indefinitely → Deadlock!

Deadlock Prevention Techniques

1. Wait-Die (Older transaction waits, younger transaction aborts).


2. Wound-Wait (Older transaction forces younger transaction to abort).
3. Timeout-Based Detection (Abort if lock wait exceeds threshold).

6. Lock Management System in Databases

A. Lock Table

● A data structure that maintains lock information:


○ Transaction ID
○ Lock Type (S/X)
○ Data Item Locked
○ Lock Status (Granted/Waiting)

B. Role of Lock Manager

● Checks lock compatibility.


● Manages lock requests and releases.
● Handles deadlocks and priority-based locking.
Lock management is essential for concurrency control and data consistency in databases. It ensures proper
synchronization using different lock types and protocols like Two-Phase Locking (2PL). However, lock-based
approaches can lead to deadlocks, which must be managed through prevention or detection mechanisms.

6.Explain about Timestamp based concurrency control.

Timestamp-Based Concurrency Control is a non-locking concurrency control mechanism used in databases


to ensure serializability without the need for locks. It assigns a unique timestamp to each transaction and
schedules operations based on these timestamps to avoid conflicts.

2. Key Concepts

A. Timestamp (TS)

● A timestamp (TS) is a unique identifier assigned to each transaction when it starts.


● The timestamp is usually based on the system clock or an increasing counter.
● If TS(T1) < TS(T2), then T1 started before T2.

B. Timestamp of Data Items

Each data item X has two timestamps:

1. Read Timestamp (RTS(X)): The largest timestamp of any transaction that successfully read X.
2. Write Timestamp (WTS(X)): The largest timestamp of any transaction that successfully wrote X.

3. Timestamp Ordering Protocol

The Timestamp-Ordering (TO) Protocol ensures serializability by enforcing the following rules:

A. Read Operation (Read(X))

A transaction T can read X only if:


TS(T)≥WTS(X)

If TS(T) < WTS(X) → Abort T (to prevent reading obsolete data).

● If allowed, RTS(X) is updated to max(RTS(X), TS(T)).

B. Write Operation (Write(X))

A transaction T can write X only if:


TS(T)≥RTS(X) AND TS(T)≥WTS(X)

If TS(T) < RTS(X) → Abort T (to prevent overwriting newer data).

● If allowed, WTS(X) is updated to TS(T).
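
The two rules can be implemented in a few lines. This Python sketch keeps RTS/WTS per data item and reproduces the read/write decisions of the example table in the next section (representation is my own):

class TOItem:
    def __init__(self):
        self.rts = 0    # largest TS that read the item
        self.wts = 0    # largest TS that wrote the item

def read(item, ts):
    if ts < item.wts:                     # would read an obsolete value
        return f"abort T{ts}"
    item.rts = max(item.rts, ts)
    return f"T{ts} reads"

def write(item, ts):
    if ts < item.rts or ts < item.wts:    # a newer txn already saw/wrote X
        return f"abort T{ts}"
    item.wts = ts
    return f"T{ts} writes"

x = TOItem()
print(read(x, 10))    # T10 reads
print(write(x, 10))   # T10 writes
print(read(x, 20))    # T20 reads
print(write(x, 20))   # T20 writes
print(write(x, 10))   # abort T10 - matches step 5 of the table below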

4. Example of Timestamp-Based Concurrency Control

| Step | Transaction T1 (TS = 10) | Transaction T2 (TS = 20) | RTS(X) | WTS(X) |
| 1 | Read(X) (allowed) | | 10 | 0 |
| 2 | Write(X) (allowed) | | 10 | 10 |
| 3 | | Read(X) (allowed) | 20 | 10 |
| 4 | | Write(X) (allowed) | 20 | 20 |
| 5 | Write(X) (aborted: TS = 10 < WTS(X) = 20) | | 20 | 20 |

● T1’s write is rejected in Step 5 because T2 (newer transaction) has already updated X.

5. Advantages and Disadvantages

✅ Advantages

✔ No deadlocks (since transactions are never waiting).


✔ Ensures serializability without using locks.
✔ Efficient for read-heavy workloads (since reads do not block).

❌ Disadvantages

✘ May cause frequent transaction aborts (if timestamps are not managed well).
✘ Not suitable for write-heavy workloads (as older transactions may get aborted).
✘ Requires system clock synchronization for accurate timestamp ordering.

6. Variants of Timestamp-Based Protocols

A. Thomas Write Rule

● If TS(T) < WTS(X), ignore the Write(X) instead of aborting the transaction.
● This reduces unnecessary aborts and improves performance.

B. Multiversion Timestamp Ordering (MVTO)

● Maintains multiple versions of a data item with different timestamps.


● Allows older transactions to read older versions, reducing conflicts.

Timestamp-based concurrency control ensures serializability without locks, using timestamps for ordering. While it
avoids deadlocks, it may cause frequent aborts. Variants like the Thomas Write Rule and Multiversion
Timestamp Ordering help optimize performance.

7.Explain about Optimistic concurrency control.

Optimistic Concurrency Control (OCC) is a concurrency control method that assumes conflicts are rare and
allows transactions to execute without acquiring locks. Instead of locking data items during execution, OCC
verifies conflicts at the validation phase before committing the transaction.

● Best suited for read-heavy workloads where conflicts are infrequent.


● Reduces the overhead of locking but may lead to transaction rollbacks.

2. Phases of Optimistic Concurrency Control

OCC operates in three phases:

A. Read Phase

● The transaction reads data from the database without acquiring locks.
● It performs all necessary computations and stores updates in a local workspace (buffer).
B. Validation Phase

● Before committing, the transaction is checked for conflicts.


● The system ensures that no other transaction has modified the data read by the current transaction.
● If conflicts are detected, the transaction is aborted and restarted.

C. Write Phase

● If the validation is successful, changes from the local workspace are written to the database.
● Otherwise, the transaction is aborted and restarted.

3. Validation Rules in OCC

A transaction T is validated by checking if it conflicts with other transactions T’ that have already committed. The
conditions to avoid conflicts are:

1. T’ finishes its write phase before T starts its read phase.

○ No conflict (T reads old values).


2. T’ finishes its write phase before T starts its validation phase.

○ No conflict (T does not read values being modified by T’).


3. T’ writes after T reads but before T validates.

○ Conflict! (T reads an inconsistent value → Abort T).
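
A minimal Python sketch of (backward) validation, assuming each transaction records a read set and a write set; a commit counter stands in for the "finishes its write phase" conditions above, and all names are illustrative:

committed_writes = []    # (commit_no, write_set) of committed transactions
commit_counter = 0

def validate(start_no, read_set):
    """True iff nothing we read was overwritten since we started."""
    return all(w.isdisjoint(read_set)
               for no, w in committed_writes if no > start_no)

def try_commit(start_no, read_set, write_set):
    global commit_counter
    if not validate(start_no, read_set):
        return "abort"                   # restart with fresh reads
    commit_counter += 1
    committed_writes.append((commit_counter, set(write_set)))
    return "commit"

# T1 and T2 both start at commit_no 0 and both read X (as in the table
# below): T1 validates first and commits; T2 then fails validation.
print(try_commit(0, {"X"}, {"X"}))   # commit
print(try_commit(0, {"X"}, {"X"}))   # abort - X changed after T2 began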

4. Example of OCC

| Step | Transaction T1 (Read → Compute → Validate → Write) | Transaction T2 (Read → Compute → Validate → Write) |
| 1 | Reads X = 100 | Reads X = 100 |
| 2 | Computes X = X + 10 (X = 110) | Computes X = X × 2 (X = 200) |
| 3 | Validates (no conflict) → Writes X = 110 | Validates (conflict! X changed) → Aborted |

● T2 is aborted because its read value (100) is no longer valid due to T1’s update (110).

5. Advantages and Disadvantages

✅ Advantages

✔ No locking overhead → Improves performance for read-heavy systems.


✔ No deadlocks → Since transactions don’t wait for locks.
✔ Better scalability → Suitable for distributed and high-concurrency systems.

❌ Disadvantages

✘ Frequent transaction rollbacks in write-intensive workloads.


✘ Wasted computation if a transaction is aborted in the validation phase.
✘ High CPU usage due to repeated validations and rollbacks.

6. Use Cases of OCC


● Databases with low write contention (e.g., analytics, reporting systems).
● Distributed databases where lock management is expensive.
● Blockchain and version control systems where conflicts are rare.

Optimistic Concurrency Control is efficient for systems where conflicts are rare. It avoids locking overhead and
deadlocks, making it ideal for read-heavy workloads. However, it can lead to frequent transaction rollbacks in
write-heavy environments.

8.Discuss about the deadlock prevention and detection.

A deadlock occurs in a database when two or more transactions are waiting indefinitely for resources locked by
each other, creating a cyclic dependency. Deadlock management is crucial for maintaining the performance and
reliability of a database system.

There are two main strategies to handle deadlocks:

1. Deadlock Prevention – Ensures that deadlocks never occur by controlling how transactions request
resources.
2. Deadlock Detection and Recovery – Allows deadlocks to occur but detects and resolves them when
they happen.

2. Deadlock Prevention

Deadlock prevention techniques ensure that the system never enters a deadlock state by following specific rules.
The main strategies include:

A. Wait-Die Scheme (Non-Preemptive)

● Based on timestamps assigned to transactions.


● If T1 (older transaction) requests a lock held by T2 (younger transaction), T1 waits.
● If T2 (younger) requests a lock held by T1 (older), T2 is aborted (it "dies") and restarted with a new
timestamp.
● Ensures that younger transactions do not wait for older ones, preventing cycles.

B. Wound-Wait Scheme (Preemptive)

● Similar to the Wait-Die scheme but with opposite logic.


● If T1 (older) requests a lock held by T2 (younger), T2 is aborted (it is "wounded") and restarted.
● If T2 (younger) requests a lock held by T1 (older), T2 waits.
● Ensures that older transactions never wait for younger ones.

C. No-Circular Wait Condition

● Imposes a total ordering of resource requests to avoid circular wait conditions.


● Transactions must request resources in a predefined order, preventing cyclic dependencies.

D. Timeout-Based Prevention

● If a transaction waits too long for a resource, it is automatically aborted and restarted.
● Works well in systems where deadlocks are rare.
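
Both timestamp schemes reduce to one decision rule, sketched below in Python (a smaller timestamp means an older transaction; the function name is my own):

def resolve(scheme, ts_requester, ts_holder):
    """Decide what happens when a lock request hits a held lock."""
    older = ts_requester < ts_holder
    if scheme == "wait-die":
        return "requester waits" if older else "requester dies"
    if scheme == "wound-wait":
        return "holder is wounded (aborts)" if older else "requester waits"
    raise ValueError(scheme)

# T1 (TS = 5, older) requests a lock held by T2 (TS = 9, younger):
print(resolve("wait-die", 5, 9))     # requester waits
print(resolve("wound-wait", 5, 9))   # holder is wounded (aborts)
# T2 (younger) requests a lock held by T1 (older):
print(resolve("wait-die", 9, 5))     # requester dies
print(resolve("wound-wait", 9, 5))   # requester waits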

3. Deadlock Detection and Recovery

Instead of preventing deadlocks, some systems allow deadlocks to occur and use detection mechanisms to
identify and resolve them.

A. Deadlock Detection

● Uses a Wait-for Graph (WFG) to represent transactions and their waiting dependencies.
● A cycle in the graph indicates a deadlock.
● The system periodically checks the WFG and detects cycles using algorithms like Depth-First Search
(DFS).
B. Deadlock Recovery

Once a deadlock is detected, the system must resolve it by aborting one or more transactions. Recovery strategies
include:

1. Transaction Rollback (Victim Selection)

○ Abort one or more transactions involved in the deadlock.


○ Prefer to abort younger transactions (less work done).
○ Consider transaction priority, resources used, or time spent.
2. Partial Rollback

○ Instead of aborting an entire transaction, roll back only the conflicting part.
3. Preempting Resources

○ Temporarily force a transaction to release resources and restart it later.

4. Example of Deadlock

Scenario:

● T1 locks X and wants Y.


● T2 locks Y and wants X.
● Both transactions wait for each other forever (deadlock).

Using Prevention:

● Wait-Die: If T1 is older, T2 dies and restarts.


● Wound-Wait: If T1 is older, T2 is aborted.

Using Detection:

● The system detects a cycle in the Wait-for Graph and aborts one transaction.

Deadlock handling is essential for maintaining database performance. Prevention techniques avoid deadlocks
entirely but may delay transactions. Detection and recovery allow deadlocks but require periodic checks and
rollback strategies. The choice depends on the system workload and performance requirements.

9(a).Discuss the implementation of Isolation.

Isolation is one of the four ACID properties (Atomicity, Consistency, Isolation, Durability) that ensures transactions
execute independently without interfering with each other. It prevents concurrent transaction anomalies such as
dirty reads, non-repeatable reads, and phantom reads.

● Isolation ensures that the intermediate states of a transaction are not visible to other transactions.
● It controls the way transactions interact in a multi-user environment.

2. Isolation Levels in DBMS

Isolation is implemented using different isolation levels, as defined by SQL standards. The higher the isolation
level, the stronger the data consistency but at the cost of performance.

| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Concurrency |
| Read Uncommitted | Possible | Possible | Possible | High |
| Read Committed | Prevented | Possible | Possible | Moderate |
| Repeatable Read | Prevented | Prevented | Possible | Moderate |
| Serializable (strictest) | Prevented | Prevented | Prevented | Low |

A. Read Uncommitted (Lowest Isolation)

● Transactions can read uncommitted changes made by other transactions.


● Issues: Dirty Reads, Non-Repeatable Reads, and Phantom Reads can occur.
● Used when: High performance is needed, and data consistency is not critical (e.g., logging systems).

B. Read Committed

● A transaction can only read committed data (no dirty reads).


● Issues: Non-repeatable reads and phantom reads may still occur.
● Used in: Most databases like Oracle (default) and SQL Server (default).

C. Repeatable Read

● Ensures that if a transaction reads a value multiple times, it sees the same value (no non-repeatable
reads).
● Issues: Phantom reads can still occur.
● Used in: MySQL InnoDB (default), banking transactions.

D. Serializable (Highest Isolation)

● Strictest level; transactions execute sequentially as if they were serialized.


● Issues: Low concurrency, high locking overhead.
● Used in: High-security applications where consistency is more important than speed.

3. Implementation of Isolation in DBMS

A. Lock-Based Concurrency Control

● Uses locks (Shared & Exclusive) to control access to data.


● Transactions must acquire locks before accessing data.

🔹 Types of Locks:

1. Shared Lock (S-Lock) – Allows multiple transactions to read but not write the same data.
2. Exclusive Lock (X-Lock) – Only one transaction can read and write.

🔹 Implementation Methods:

● Two-Phase Locking (2PL):

○ Ensures serializability by following a growing and shrinking phase.


○ Issue: Can cause deadlocks.
● Strict Two-Phase Locking (Strict 2PL):

○ Holds all locks until the end of the transaction to prevent cascading aborts.

B. Timestamp-Based Concurrency Control

● Each transaction is assigned a timestamp (TS) at the start.


● Transactions are executed in timestamp order, ensuring consistency.

🔹 Rules:
1. If a transaction wants to read X but a newer transaction has already written X (TS(T) < WTS(X)), it is aborted.
2. If a transaction wants to write X but a newer transaction has already read or written X (TS(T) < RTS(X) or TS(T) < WTS(X)), it is aborted.
● Used in: High-performance databases that avoid locks.

C. Multiversion Concurrency Control (MVCC)

● Multiple versions of data are maintained instead of locking.

● Transactions read old versions of data while others write new versions.

● Used in: PostgreSQL, MySQL InnoDB, Oracle.

● Advantage: High concurrency, no locking overhead.

● Disadvantage: More storage required for multiple versions
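
A toy sketch of MVCC reads (illustrative only, not any engine's actual implementation): writers append versions tagged with a commit timestamp, and a reader picks the newest version no later than its own snapshot, so readers never block writers.

versions = {"X": [(0, 100)]}         # item -> [(commit_ts, value), ...]

def write(item, value, commit_ts):
    """A committing writer appends a new version instead of overwriting."""
    versions[item].append((commit_ts, value))

def read(item, snapshot_ts):
    """Latest version visible to a transaction that started at snapshot_ts."""
    visible = [(ts, v) for ts, v in versions[item] if ts <= snapshot_ts]
    return max(visible)[1]

write("X", 150, commit_ts=5)          # a later writer adds a new version
print(read("X", snapshot_ts=3))       # 100 - an old snapshot still reads
print(read("X", snapshot_ts=7))       # 150 - a newer snapshot sees the update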

Isolation is crucial for consistency in concurrent transactions. It is implemented using:

1. Isolation Levels (SQL Standard) – Balances performance vs. consistency.


2. Locking Mechanisms (2PL, Strict 2PL) – Ensures serializability.
3. Timestamp-Based Methods – Orders transactions by timestamp.
4. MVCC (Multiversion Concurrency Control) – Allows high concurrency.

The best approach depends on the workload:

● OLTP systems prefer Read Committed or Repeatable Read.


● Highly concurrent systems use MVCC or Timestamp-Based Control.
● Critical transactions use Serializable Isolation.

9(b).Explain about Recovery and Atomicity.

In the context of database management systems (DBMS), "Recovery" and "Atomicity" are crucial concepts that
ensure data integrity and reliability. Here's a breakdown:

Atomicity:

● Definition:
○ Atomicity is one of the ACID properties (Atomicity, Consistency, Isolation, Durability) of database
transactions.
○ It ensures that a transaction is treated as a single, indivisible unit of work.
○ This means that either all operations within a transaction are completed successfully, or none of them
are. There's no partial execution.
● Importance:
○ Atomicity prevents inconsistent data states. If a transaction fails in the middle of its execution (due to
a system crash, for example), the database is rolled back to its previous consistent state.
○ This guarantees that data remains accurate and reliable.
● Example:
○ Consider a bank transfer from account A to account B. Atomicity ensures that either both the debit
from A and the credit to B occur, or neither occurs. If the system crashes after the debit but before the
credit, the system will roll back the debit, maintaining data integrity.

Recovery:

● Definition:
○ Recovery refers to the process of restoring a database to a consistent state after a system failure.
○ DBMSs employ various techniques to recover from failures, such as hardware crashes, software
errors, or power outages.
● Importance:
○ Recovery ensures that data is not lost or corrupted due to system failures.
○ It allows the database to resume operations from a known, consistent state.
● Key Techniques:
○ Log-based recovery:
■ This involves maintaining a log of all database changes.
■ In case of a failure, the log is used to undo or redo transactions, restoring the database to a
consistent state.
○ Checkpoints:
■ Checkpoints are points in time when the database's state is written to stable storage.
■ They reduce the amount of log data that needs to be processed during recovery.
● Relationship between Atomicity and Recovery:
○ Atomicity is a property that recovery mechanisms rely on.
○ Recovery procedures use logs to ensure that transactions are either fully applied or fully undone,
upholding the principle of atomicity.
○ In essence, Atomicity is a property that must be upheld, and recovery is the mechanism that allows
the database to uphold that property, even after system failures.

10(a).Discuss about failure classification.

Failure classification is essential in database systems to ensure reliability and consistency in the presence of errors.
Failures can occur due to various reasons such as system crashes, software bugs, or human errors. Proper
classification of failures helps in implementing recovery mechanisms and maintaining data integrity.

Types of Failures

1. Transaction Failures

○ Occur when a transaction cannot complete its execution due to logical or system-related errors.
○ Examples:
■ Deadlock detection and termination.
■ Logical errors (e.g., division by zero, constraint violations).
■ System-imposed aborts due to excessive resource usage.
2. System Failures

○ Occur when the system crashes due to hardware or software issues, affecting the database’s normal operations.
○ Characteristics:
■ The main memory (volatile storage) is lost, but secondary storage (disk) remains intact.
■ Requires recovery mechanisms like undo-redo logging to restore consistency.
○ Examples:
■ Power failure.
■ Operating system crash.
■ Memory corruption.
3. Media Failures


○ Occur when physical storage devices such as hard disks or SSDs are damaged.
○ Results in loss of stored data unless proper backups exist.
○ Examples:
■ Hard disk crash.
■ Bad sectors or corrupted storage blocks.
■ SSD wear-out.
4. Communication Failures

○ Occur in distributed databases where network issues lead to incomplete transactions.
○ Causes inconsistencies in multi-site database operations.
○ Examples:
■ Network disconnection.
■ Message loss or corruption.
■ Server failure in a distributed database.
5. Application Failures

○ Occur due to errors in application logic that lead to incorrect data being processed.
○ Examples:
■ Software bugs.
■ Misconfigured transactions.
■ Incorrect user inputs leading to erroneous database operations.
Failure Recovery Mechanisms

To handle these failures, databases implement recovery mechanisms such as:

● Transaction Rollback: Used to undo changes made by an incomplete transaction.


● Checkpointing: Periodic saving of system state to reduce recovery time.
● Logging (Write-Ahead Logging - WAL): Ensures changes are recorded before they are applied.
● Shadow Paging: Maintains old copies of pages to recover from failures.

By classifying failures properly, database systems can apply suitable recovery strategies and ensure high availability
and data integrity.

10(b).Describe the ARIES recovery algorithm.

ARIES is a widely used recovery algorithm in database systems that ensures atomicity and durability in the
presence of failures. It follows a Write-Ahead Logging (WAL) approach and supports fine-grained concurrency
control.

Key Features of ARIES

1. Write-Ahead Logging (WAL):

○ Before applying any change to the database, a log record is written to stable storage.
○ Ensures that redo and undo operations can be performed correctly.
2. Repeating History During Redo:

○ After a failure, ARIES repeats the exact history of the system by reapplying all operations from the log.
○ Ensures that all committed transactions are recovered properly.
3. Logging Undo Operations:

○ Uses compensation log records (CLRs) to track undo operations.


○ Helps in handling repeated failures during recovery.

Phases of ARIES Recovery

The ARIES algorithm consists of three main phases after a failure:

1. Analysis Phase

● Reads the log to determine:


○ Active transactions at the time of failure.
○ Dirty pages (modified pages not yet written to disk).
○ The starting point for redo and undo operations.
● Identifies the last checkpoint to optimize recovery.

2. Redo Phase

● Reapplies all logged operations from the last checkpoint to reconstruct the database state.
● Ensures that all committed transactions are properly applied.
● Uses the log sequence number (LSN) to avoid redundant operations.

3. Undo Phase

● Reverses the effects of uncommitted transactions using CLRs.


● Ensures atomicity by rolling back incomplete transactions.
● If a failure occurs during undo, CLRs help resume the rollback process from where it left off.

Advantages of ARIES

● Efficient recovery by using a combination of checkpoints and WAL.


● Supports concurrency while ensuring data consistency.
● Handles multiple failures effectively using CLRs.
● Flexible logging mechanism suitable for various database architectures.

ARIES is a robust and efficient recovery algorithm that guarantees data consistency and durability in database
management systems. By following Write-Ahead Logging, Repeating History, and Logging Undo Operations, it
ensures reliable transaction recovery even in the presence of complex failures
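
A heavily simplified Python sketch of the three passes (real ARIES additionally tracks LSNs and pageLSNs, uses checkpoints, and writes CLRs during undo; the log records and values below are invented for illustration):

log = [
    {"txn": "T1", "op": "UPDATE", "item": "X", "before": 100, "after": 150},
    {"txn": "T2", "op": "UPDATE", "item": "Y", "before": 10,  "after": 20},
    {"txn": "T1", "op": "COMMIT"},
    # crash: T2 never commits
]

db = {"X": 100, "Y": 10}    # illustrative on-disk state after the crash

# 1. Analysis: find transactions still active at the crash (the "losers").
committed = {r["txn"] for r in log if r["op"] == "COMMIT"}
losers = {r["txn"] for r in log} - committed

# 2. Redo: repeat history - reapply every logged update, winners and
#    losers alike, to reconstruct the exact pre-crash state.
for r in log:
    if r["op"] == "UPDATE":
        db[r["item"]] = r["after"]

# 3. Undo: roll back the losers using the before-images, newest first.
for r in reversed(log):
    if r["op"] == "UPDATE" and r["txn"] in losers:
        db[r["item"]] = r["before"]

print(db)   # {'X': 150, 'Y': 10} - T1's change is durable, T2's is undone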

11.Explain all the operations on B+ tree by taking a suitable example.

A B+ Tree is a self-balancing m-ary search tree used in database indexing and file systems. It efficiently supports
search, insert, delete, and range queries.

Key Properties of B+ Tree

1. All keys in internal nodes act as a separator for child nodes.


2. Leaf nodes store actual data and are linked together for range queries.
3. Balanced structure ensures logarithmic search, insert, and delete operations.

Example: B+ Tree of Order 3 (m = 3)

● Maximum keys per internal node = m - 1 = 2

● Minimum keys per node = ceil(m/2) - 1 = 1
● Maximum children per internal node = m = 3
● In this example, leaf nodes hold up to 3 entries before splitting.

Let's construct a B+ Tree for the following sequence of keys:


10, 20, 30, 40, 50, 60, 70, 80, 90

1. Insertion in B+ Tree

Step 1: Insert (10, 20, 30)

● The root starts as a leaf node.


● Since leaves in this example hold up to 3 entries, 10, 20, and 30 all fit without splitting.

[10 | 20 | 30]

Step 2: Insert (40, 50)

● Inserting 40 overflows the leaf node.
● We split the node into two and promote 30 to a new root. (Strictly, a B+ tree copies the separator key up while also keeping it in a leaf; the diagrams here use the simplified promotion.)

[30]

/ \

[10 | 20] [40 | 50]

Step 3: Insert (60, 70, 80, 90)

● Insert 60 into the right leaf [40 | 50].


● Insert 70, which causes overflow, leading to another split.
● Promote 60 to the root.

[30 | 60]

/ | \

[10 | 20] [40 | 50] [70 | 80 | 90]

2. Searching in B+ Tree

To search 50:
● Start at the root [30 | 60] → 50 is in the middle subtree.
● Move to [40 | 50] → Found 50!
Search takes O(log n) time due to the balanced structure.

3. Deletion in B+ Tree

Let's delete 70:

● Locate 70 in the rightmost leaf [70 | 80 | 90].


● Remove it → [80 | 90] (no underflow).

New structure remains valid:

[30 | 60]

/ | \

[10 | 20] [40 | 50] [80 | 90]

The B+ Tree maintains balance using splitting and merging operations, ensuring efficient O(log n) performance
for search, insert, and delete operations. It is widely used in databases for indexing and range queries due to its
linked leaf nodes for sequential access.
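
A sketch of search and range scan over the final tree above, with nodes as plain Python dicts and the leaves linked left to right (the structure and names are illustrative, not a full B+ tree implementation):

import bisect

leaf1 = {"keys": [10, 20], "next": None}
leaf2 = {"keys": [40, 50], "next": None}
leaf3 = {"keys": [80, 90], "next": None}
leaf1["next"], leaf2["next"] = leaf2, leaf3
root = {"keys": [30, 60], "children": [leaf1, leaf2, leaf3]}

def search(node, key):
    while "children" in node:                        # descend internal nodes
        i = bisect.bisect_right(node["keys"], key)   # first separator > key
        node = node["children"][i]
    return key in node["keys"], node                 # O(log n) descent

def range_scan(node, lo, hi):
    """All keys in [lo, hi], walking the linked leaves sequentially."""
    _, leaf = search(node, lo)
    out = []
    while leaf is not None:
        for k in leaf["keys"]:
            if k > hi:
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf["next"]
    return out

print(search(root, 50)[0])         # True
print(range_scan(root, 15, 85))    # [20, 40, 50, 80]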

12.Explain about Static Hashing.

Static Hashing is a technique used in database management systems (DBMS) and file organization to store and
retrieve records efficiently using hash functions. In static hashing, the number of primary buckets (storage
locations) remains fixed throughout the lifespan of the database.

Working of Static Hashing

1. Hash Function:


A hash function H(K) = B maps a given key K to a specific bucket B.

Example: If the total number of buckets is 10, a simple hash function can be: H(K) = K mod 10
○ For K = 25, the bucket assigned is H(25) = 25 % 10 = 5.
2. Bucket Structure:

○ Each bucket stores multiple records.


○ The number of buckets remains constant.

Operations in Static Hashing

1. Insertion

● Apply the hash function to find the target bucket.


● Store the record in that bucket.
● If the bucket is full, overflow handling is required (using chaining or overflow buckets).

Example: Insert keys {10, 22, 35, 40} into a hashing scheme with H(K) = K mod 10.

● H(10) = 0 → Store in Bucket 0


● H(22) = 2 → Store in Bucket 2
● H(35) = 5 → Store in Bucket 5
● H(40) = 0 → Bucket 0 is full, handle overflow

2. Search

● Compute H(K) to locate the bucket.


● Search for the key within that bucket.
● If overflow occurs, traverse the overflow chain.
Example: Searching for 35

● H(35) = 5 → Look in Bucket 5


● Found 35 directly in O(1) time.

3. Deletion

● Locate the record using H(K).


● Remove the record while maintaining overflow pointers (if any).

Example: Delete 40

● H(40) = 0 → Look in Bucket 0


● If it exists in overflow, update the pointers accordingly.
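
The whole scheme fits in a short Python sketch (using an overflow chain per bucket, matching the insert/search/delete walk-through above; representation is my own):

NUM_BUCKETS = 10
buckets = [[] for _ in range(NUM_BUCKETS)]    # each bucket is a chain

def h(key):
    return key % NUM_BUCKETS                  # H(K) = K mod 10

def insert(key):
    buckets[h(key)].append(key)               # the chain absorbs overflow

def search(key):
    return key in buckets[h(key)]             # O(1) expected lookup

def delete(key):
    chain = buckets[h(key)]
    if key in chain:
        chain.remove(key)

for k in (10, 22, 35, 40):
    insert(k)
print(buckets[0])    # [10, 40] - 40 went to bucket 0's overflow chain
print(search(35))    # True (bucket 5)
delete(40)
print(buckets[0])    # [10]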

Advantages of Static Hashing

✔ Fast retrieval – O(1) lookup time in most cases.


✔ Efficient for small datasets with minimal changes.
✔ Simple to implement compared to dynamic hashing.

Disadvantages of Static Hashing

❌ Fixed bucket size – Leads to overflow if too many records are inserted.
❌ Wasted space – If records are fewer, some buckets remain unused.
❌ Poor scalability – If data grows, reorganization of the entire hash table is needed.

Static Hashing is efficient for small, stable datasets but struggles with scalability. For dynamic applications,
Dynamic Hashing (e.g., Extendible or Linear Hashing) is preferred.
