Module-5-3

Module 3 of CST 204 covers SQL Data Manipulation Language (DML), including writing queries for single and multiple tables, using JOINs, and understanding nested queries. It also introduces aggregation functions, grouping data with GROUP BY, and filtering groups with HAVING, along with the creation and use of views and assertions in SQL. Practical exercises are included to reinforce learning and application of these concepts.

Uploaded by

sky2022n
Module 3

CST 204 DBMS


Part 1 - Syllabus
SQL DML (Data Manipulation Language)
● SQL queries on single and multiple tables
● Nested queries (correlated and non-correlated)
● Aggregation and grouping
● Views
● Assertions
● Triggers
● SQL data types
What We’ll Cover in This Session:

1. Introduction to SQL DML (Data Manipulation Language)


2. Writing SQL Queries on Single Tables
● SELECT, WHERE, ORDER BY, etc.
3. Writing SQL Queries on Multiple Tables
● JOINs (INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN)
4. Practical Examples and Exercises
What is SQL DML?

● DML stands for Data Manipulation Language, a subset of SQL used to retrieve, insert, update, and delete data in a database.

Key Commands in DML:
● SELECT: Retrieve data from one or more tables.
● INSERT: Add new records to a table.
● UPDATE: Modify existing records in a table.
● DELETE: Remove records from a table.
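The later slides only show SELECT in action. As a minimal sketch of the other three commands, assuming an Employees table with the columns used in the later examples (EmployeeID, FirstName, LastName, Salary); the sample values are hypothetical:

```sql
-- Assumes an Employees table with these columns already exists.
-- All values below are hypothetical and only for illustration.

-- INSERT: add a new record.
INSERT INTO Employees (EmployeeID, FirstName, LastName, Salary)
VALUES (101, 'Asha', 'Nair', 55000);

-- UPDATE: modify the records that match the WHERE condition.
UPDATE Employees
SET Salary = 60000
WHERE EmployeeID = 101;

-- DELETE: remove the records that match the WHERE condition.
DELETE FROM Employees
WHERE EmployeeID = 101;
```

Leaving out the WHERE clause makes UPDATE or DELETE apply to every row in the table, so it is worth checking the condition with a SELECT first.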


Basic SQL Query Structure
SELECT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column ASC|DESC;

Key Components:
● SELECT: Specifies the columns to retrieve.
● FROM: Specifies the table(s) to query.
● WHERE: Filters rows based on conditions.
● ORDER BY: Sorts the result set.
Example - Querying a Single Table

SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Salary > 50000
ORDER BY Salary DESC;

Output Explanation:
● Columns selected: EmployeeID, FirstName, LastName, Salary.
● Filter applied: Salary > 50000.
● Sorted by: Salary in descending order.
Querying Multiple Tables - Why Use JOINs?

Problem:
Data is often spread across multiple tables (e.g., Employees and Departments).
Solution:
Use JOINs to combine data from two or more tables based on related columns.
Types of JOINs:
1. INNER JOIN: Returns only the rows that have a match in both tables.
2. LEFT JOIN: Returns all rows from the left table and the matching rows from the right table (NULLs where there is no match).
3. RIGHT JOIN: Returns all rows from the right table and the matching rows from the left table (NULLs where there is no match).
4. FULL OUTER JOIN: Returns all rows from both tables, matched where possible, with NULLs filling the side that has no match.
INNER JOIN Example
Scenario:
Retrieve employee names and their department names.
Tables:

● Employees: Contains EmployeeID, FirstName, LastName, DepartmentID.
● Departments: Contains DepartmentID, DepartmentName.

Query:
SELECT Employees.FirstName, Employees.LastName, Departments.DepartmentName
FROM Employees
INNER JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

Output Explanation:
● Combines data from Employees and Departments using DepartmentID.
● Only matching rows are returned.

LEFT JOIN Example

Scenario:

Retrieve all employees and their department names (even if some employees don’t belong to a department).

Query:
SELECT Employees.FirstName, Employees.LastName, Departments.DepartmentName
FROM Employees
LEFT JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

Output Explanation:
● All rows from Employees are included.
● If no matching department exists, DepartmentName is NULL.
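The remaining two JOIN types follow the same pattern. The sketch below reuses the Employees and Departments tables from the examples above; it is an illustration, not a query from the slides.

```sql
-- RIGHT JOIN: every department appears, even one with no employees
-- (FirstName and LastName are NULL for such a department).
SELECT Employees.FirstName, Employees.LastName, Departments.DepartmentName
FROM Employees
RIGHT JOIN Departments
    ON Employees.DepartmentID = Departments.DepartmentID;

-- FULL OUTER JOIN: every employee and every department appears;
-- whichever side has no match is filled with NULLs.
SELECT Employees.FirstName, Employees.LastName, Departments.DepartmentName
FROM Employees
FULL OUTER JOIN Departments
    ON Employees.DepartmentID = Departments.DepartmentID;
```

MySQL does not support FULL OUTER JOIN; there, the same result is usually obtained by combining a LEFT JOIN and a RIGHT JOIN with UNION.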
Practical Exercise

Task 1:
Write a query to retrieve all employees with salaries greater than 60000, sorted by salary in ascending order.

Task 2:
Write a query to retrieve all departments and the number of employees in each department.
Hint:
Use COUNT() and GROUP BY.
Recap and Key Takeaways

What We Learned in this Session:


1. SQL DML focuses on retrieving, inserting, updating, and deleting data.
2. The SELECT statement is used to query data from tables.
3. Use JOINs to combine data from multiple tables.
4. Types of JOINs: INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN.

Next Steps:

● Practice writing queries on your own.


● Explore advanced topics like nested queries and aggregation in the next session.
Topic 3.2: Nested Queries in SQL: Correlated and Non-Correlated
Module 3
What We’ll Cover in This Session 3.2

1. What are Nested Queries?
2. Non-Correlated Subqueries
● Definition and Examples
3. Correlated Subqueries
● Definition and Examples
4. Differences Between Correlated and Non-Correlated Subqueries
5. Practical Exercises
What are Nested Queries?

Definition:
● A nested query (or subquery) is a query embedded within another SQL query.
● It allows you to break down complex problems into smaller, logical steps.
Use Cases:
● Retrieve data based on conditions derived from another query.
● Perform calculations or comparisons using intermediate results.
Types of Nested Queries:
1. Non-Correlated Subqueries : Independent of the outer query.
2. Correlated Subqueries : Dependent on the outer query.
Non-Correlated Subqueries

Definition:
● A non-correlated subquery executes independently of the outer query.
● The result of the subquery is computed once and used by the outer query.
Syntax Example:
SELECT column1, column2
FROM table_name
WHERE column1 = (SELECT column1 FROM another_table WHERE condition);

Key Characteristics:
● Executes first and returns a single value or a set of values. With a comparison operator such as =, the subquery must return a single value; use IN when it can return several.
● Outer query uses the result as a condition.
Example - Non-Correlated Subquery

Scenario:
Find employees whose salary is greater than the average salary.

Query:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);

Explanation:
1. The subquery (SELECT AVG(Salary) FROM Employees) calculates the average salary.
2. The outer query retrieves employees with salaries greater than this average.
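A non-correlated subquery can also return a set of values, in which case it is used with IN rather than a comparison operator. The sketch below assumes the Employees and Departments tables from the JOIN examples; the department names 'Sales' and 'Marketing' are hypothetical.

```sql
-- The inner query runs once and produces a list of DepartmentID values.
-- The outer query keeps employees whose DepartmentID is in that list.
SELECT EmployeeID, FirstName, LastName
FROM Employees
WHERE DepartmentID IN (
    SELECT DepartmentID
    FROM Departments
    WHERE DepartmentName IN ('Sales', 'Marketing')
);
```

The inner query never refers to the outer Employees row, which is what makes it non-correlated.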
Correlated Subqueries

Definition:
● A correlated subquery depends on the outer query for its values.
● Conceptually, it executes once for each row processed by the outer query.
Syntax Example:
SELECT column1, column2
FROM table_name AS outer_table
WHERE column1 = (SELECT column1 FROM another_table
WHERE condition AND outer_table.column = another_table.column);

Key Characteristics:
● Evaluated repeatedly, once for each row in the outer query (the query optimizer may rewrite it internally).
● Often used for row-by-row comparisons.
Example - Correlated Subquery

Scenario:
Find employees who earn more than the average salary in their department.
Query:
SELECT EmployeeID, FirstName, LastName, Salary, DepartmentID
FROM Employees AS E1
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employees AS E2
    WHERE E1.DepartmentID = E2.DepartmentID
);
Explanation:
1. For each employee (E1), the subquery calculates the average salary of employees in the same department (E2).
2. The outer query retrieves employees whose salary exceeds this average.
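Correlated subqueries are also commonly written with EXISTS, which only tests whether the inner query returns any row. A minimal sketch, assuming the same Employees and Departments tables; the 100000 threshold is a hypothetical value.

```sql
-- For each department D, the inner query looks for at least one employee
-- of that department earning more than 100000.
SELECT D.DepartmentID, D.DepartmentName
FROM Departments AS D
WHERE EXISTS (
    SELECT 1
    FROM Employees AS E
    WHERE E.DepartmentID = D.DepartmentID   -- reference to the outer row
      AND E.Salary > 100000
);
```

The condition E.DepartmentID = D.DepartmentID is the correlation: the inner query cannot be run on its own because D belongs to the outer query.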
Differences Between Correlated and Non-Correlated Subqueries

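● Dependency: A non-correlated subquery is independent of the outer query; a correlated subquery refers to columns of the outer query.
● Execution: A non-correlated subquery is computed once; a correlated subquery is evaluated once for each row of the outer query.
● Typical use: Non-correlated subqueries suit simple conditions; correlated subqueries suit row-by-row comparisons.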
Practical Exercise

Task 1:
Write a query to find employees whose salary is greater than the highest salary in the "Sales" department.
Hint:
Use a non-correlated subquery to find the maximum salary in the "Sales" department.
Task 2:
Write a query to find departments where the total salary of employees exceeds 500,000.
Hint:
Use a correlated subquery to calculate the total salary for each department.
Recap and Key Takeaways Session 3.2

What We Learned in Session 3.2:


1. Nested queries allow you to perform complex operations in SQL.
2. Non-Correlated Subqueries: Execute independently of the outer query, and their result is computed once.
3. Correlated Subqueries: Depend on the outer query and execute repeatedly.
4. Use non-correlated subqueries for simple conditions and correlated subqueries for row-by-row comparisons.

Next Steps:
● Practice writing nested queries to solidify your understanding.
● Explore advanced topics like aggregation and grouping in the next session.
Topic 3.3: Aggregation and Grouping
Module 3
What We’ll Cover in Session 3.3

1. Introduction to Aggregation Functions
● COUNT, SUM, AVG, MIN, MAX
2. Grouping Data with GROUP BY
3. Filtering Groups with HAVING
4. Practical Examples and Exercises
What is Aggregation?

Definition:
● Aggregation involves summarizing or combining data into a single value or smaller set of values.
● It’s commonly used to calculate totals, averages, counts, etc., from large datasets.
Key Aggregation Functions in SQL:
1. COUNT: Counts the number of rows.
2. SUM: Calculates the total of numeric values.
3. AVG: Computes the average of numeric values.
4. MIN: Finds the smallest value.
5. MAX: Finds the largest value.
Example - Basic Aggregation

Scenario:
Find the total salary paid to all employees.

Query:
SELECT SUM(Salary) AS TotalSalary
FROM Employees;

Explanation:
● SUM(Salary) calculates the total salary of all employees.
● AS TotalSalary renames the result column for better readability.

Grouping Data with GROUP BY

Definition:
● The GROUP BY clause groups rows that have the same values in specified columns into summary rows.
● It’s often used with aggregation functions to perform calculations on each group.
Syntax Example:
SELECT column1, AGG_FUNC(column2)
FROM table_name
GROUP BY column1;

Example:
Find the total salary paid in each department.
Query:
SELECT DepartmentID, SUM(Salary) AS TotalSalary
FROM Employees
GROUP BY DepartmentID;

Explanation:
● Groups employees by DepartmentID.
● Calculates the total salary for each department.
Filtering Groups with HAVING

Definition:
● The HAVING clause filters groups based on conditions, similar to how WHERE filters rows.
● It’s used after GROUP BY to apply conditions to aggregated results.
Syntax Example:
SELECT column1, AGG_FUNC(column2)
FROM table_name
GROUP BY column1
HAVING condition;

Example:
Find departments where the total salary exceeds 500,000.
Query:
SELECT DepartmentID, SUM(Salary) AS TotalSalary
FROM Employees
GROUP BY DepartmentID
HAVING SUM(Salary) > 500000;

Explanation:
● Groups employees by DepartmentID.
● Filters only those departments where the total salary exceeds 500,000.
Combining GROUP BY and HAVING

Scenario:
Find departments with more than 10 employees and an average salary greater than 60,000.
Query:
SELECT DepartmentID, COUNT(*) AS EmployeeCount, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID
HAVING COUNT(*) > 10 AND AVG(Salary) > 60000;

Explanation:
● Groups employees by DepartmentID.
● Counts the number of employees (COUNT(*)) and calculates the average salary (AVG(Salary)) for each department.
● Filters only departments with more than 10 employees and an average salary > 60,000.
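WHERE and HAVING can appear in the same query, and the difference in when each one applies is easiest to see side by side. The following is a sketch on the same Employees table; the 30000 cut-off is a hypothetical value chosen for illustration.

```sql
SELECT DepartmentID, COUNT(*) AS EmployeeCount, AVG(Salary) AS AvgSalary
FROM Employees
WHERE Salary > 30000        -- row filter: applied before rows are grouped
GROUP BY DepartmentID
HAVING COUNT(*) > 10        -- group filter: applied after aggregation
ORDER BY AvgSalary DESC;    -- sort the surviving groups
```

Here employees earning 30000 or less are dropped first, so they count toward neither COUNT(*) nor AVG(Salary); only then are departments with 10 or fewer remaining employees filtered out.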
Practical Exercise

Task 1:
Write a query to find the total number of employees in each department.
Hint:
Use COUNT(*) with GROUP BY.
Task 2:
Write a query to find departments where the minimum salary is less than 40,000.
Hint:
Use MIN(Salary) with GROUP BY and HAVING.
Recap and Key Takeaways Session 3.3

What We Learned Today:


1. Aggregation functions (COUNT, SUM, AVG, MIN, MAX) summarize data.
2. GROUP BY groups rows with the same values into summary rows.
3. HAVING filters groups based on conditions applied to aggregated results.
4. Combining GROUP BY and HAVING allows for powerful data analysis.

Next Steps:
● Practice writing queries with aggregation and grouping.
● Explore advanced topics like views in future sessions.

Questions?
Topic 3.4: Views, Assertions (with Examples)

Module 3
What We’ll Cover in This Session 3.4

1. Introduction to Views
● What are Views?
● Creating and Using Views
2. Types of Views
● Simple Views
● Complex Views
3. Assertions in SQL
● What are Assertions?
● Syntax and Examples
4. Practical Examples and Exercises
Creating and Using Views

Syntax:
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;

Example:
Create a view to retrieve employee names and their departments.
Query:
CREATE VIEW EmployeeDepartmentView AS
SELECT Employees.FirstName, Employees.LastName, Departments.DepartmentName
FROM Employees
INNER JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

Using the View:
SELECT * FROM EmployeeDepartmentView;

Explanation:
● The CREATE VIEW statement defines the view.
● The view can then be queried like a regular table.
Types of Views

Simple Views:
● Based on a single table.
● Do not involve complex operations like joins or aggregations.
Example:
CREATE VIEW HighSalaryEmployees AS
SELECT FirstName, LastName, Salary
FROM Employees
WHERE Salary > 60000;

Complex Views:
● Involve multiple tables, joins, or aggregations.
Example:
CREATE VIEW DepartmentSalarySummary AS
SELECT DepartmentID, COUNT(*) AS EmployeeCount, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID;
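Once a complex view exists, its aggregated columns behave like ordinary columns. A short sketch of querying DepartmentSalarySummary; the 60000 threshold is a hypothetical value.

```sql
-- EmployeeCount and AvgSalary are plain columns of the view,
-- so they can be filtered with WHERE instead of HAVING.
SELECT DepartmentID, EmployeeCount, AvgSalary
FROM DepartmentSalarySummary
WHERE AvgSalary > 60000
ORDER BY AvgSalary DESC;
```

The GROUP BY logic stays hidden inside the view definition, which is the main reason for wrapping a complex query in a view.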
Modifying and Dropping Views

Modifying a View:
Use CREATE OR REPLACE VIEW to update the definition of an existing view.

Example:
CREATE OR REPLACE VIEW HighSalaryEmployees AS
SELECT FirstName, LastName, Salary
FROM Employees
WHERE Salary > 70000;

Dropping a View:
Use DROP VIEW to delete a view.
Example:
DROP VIEW HighSalaryEmployees;
What are Assertions?

Definition:
● An Assertion is a database constraint that ensures a condition is always true across the entire database.
● If the condition becomes false, the database rejects the operation that caused the violation.
Key Characteristics:
● Assertions are global constraints that can span multiple tables.
● Few RDBMSs support them directly (e.g., MySQL does not support assertions).

Syntax (Theoretical):
CREATE ASSERTION assertion_name
CHECK (condition);
Example of Assertions

Scenario:
Ensure that no department has more than 20 employees.
Query (Theoretical):
CREATE ASSERTION MaxEmployeesPerDepartment
CHECK (
    NOT EXISTS (
        SELECT DepartmentID
        FROM Employees
        GROUP BY DepartmentID
        HAVING COUNT(*) > 20
    )
);

Explanation:
● The assertion checks that no department exceeds 20 employees.
● If a transaction violates this condition, it will be rejected.

Note:
Assertions are not widely implemented in modern RDBMSs. Instead, similar functionality can often be achieved using triggers or application-level logic.
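As a rough illustration of the trigger-based alternative, the sketch below approximates the 20-employee rule in MySQL syntax. The trigger name and the variable dept_count are hypothetical, and it only covers INSERT; an UPDATE that changes DepartmentID would need a similar trigger.

```sql
-- MySQL syntax. DELIMITER is a mysql client command that lets the
-- trigger body contain semicolons.
DELIMITER //

CREATE TRIGGER EnforceMaxEmployeesPerDepartment
BEFORE INSERT ON Employees
FOR EACH ROW
BEGIN
    DECLARE dept_count INT;

    -- Count the employees already in the new row's department.
    SELECT COUNT(*) INTO dept_count
    FROM Employees
    WHERE DepartmentID = NEW.DepartmentID;

    -- Reject the insert if the department is already full.
    IF dept_count >= 20 THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'A department cannot have more than 20 employees';
    END IF;
END//

DELIMITER ;
```

Unlike a true assertion, this check runs only for the events it is attached to, so every operation that could break the rule needs its own trigger.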
Practical Exercise

Task 1:
Create a view to retrieve all employees with salaries greater than 50,000.
Hint:
Use CREATE VIEW with a WHERE clause.

Task 2:
Write a query to use the view created in Task 1 to find employees whose last name starts with "S".
Hint:
Query the view like a regular table.
Recap and Key Takeaways of Session 3.4

What We Learned Today:


1. Views simplify complex queries and provide security by encapsulating logic.
2. Views can be simple (single table) or complex (multiple tables, joins, aggregations).
3. Assertions enforce global constraints but are rarely supported in modern RDBMSs.
4. Use views for reusable queries and assertions for enforcing business rules (if supported).

Next Steps:
● Practice creating and using views in your database projects.
● Explore advanced topics like triggers in the next session.
Topic 3.5: Triggers (with Examples), SQL Data Types

Module 3
What We’ll Cover in This Session 3.5

1. Introduction to Triggers
● What are Triggers?
● Use Cases for Triggers
2. Creating and Using Triggers
● Syntax and Examples
3. SQL Data Types
● Common Data Types in SQL
● Choosing the Right Data Type
4. Practical Examples and Exercises
What are Triggers?

Definition:
● A Trigger is a database object that automatically executes a specified action when a particular event occurs on a table or view.
● Events can include INSERT, UPDATE, or DELETE.

Key Characteristics:
● Triggers enforce business rules, maintain data integrity, or log changes.
● They execute before or after the triggering event.
Use Cases:
1. Automatically update related tables when data changes.
2. Log changes to a history table.
3. Enforce complex constraints that cannot be handled by standard constraints.
Trigger Syntax

Syntax:
CREATE TRIGGER trigger_name
{BEFORE | AFTER} {INSERT | UPDATE | DELETE}
ON table_name
FOR EACH ROW
BEGIN
    -- Trigger logic here
END;

Explanation of Components:
● trigger_name: Name of the trigger.
● {BEFORE | AFTER}: Specifies whether the trigger runs before or after the event.
● {INSERT | UPDATE | DELETE}: Specifies the event that triggers the action.
● FOR EACH ROW: Indicates the trigger operates on a row-by-row basis.

Example - Creating a Trigger

Scenario:
Log every deletion from the Employees table into a DeletedEmployees table.
Query:
CREATE TRIGGER LogEmployeeDeletion
AFTER DELETE ON Employees
FOR EACH ROW
BEGIN
    INSERT INTO DeletedEmployees (EmployeeID, FirstName, LastName, DeletionDate)
    VALUES (OLD.EmployeeID, OLD.FirstName, OLD.LastName, NOW());
END;
Explanation:
● The trigger fires after a row is deleted from the Employees table.
● It inserts the deleted row's details into the DeletedEmployees table, along with the current timestamp (NOW()).
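The trigger relies on a DeletedEmployees table that the slides do not define. A minimal sketch of a matching table and a quick check follows; the column types and the EmployeeID value 101 are assumptions made for illustration.

```sql
-- Hypothetical log table with the four columns the trigger inserts into.
-- It should exist before the trigger fires; creating it first is the safe order.
CREATE TABLE DeletedEmployees (
    EmployeeID   INT,
    FirstName    VARCHAR(50),
    LastName     VARCHAR(50),
    DeletionDate DATETIME
);

-- With the trigger in place, this delete fires it once per deleted row.
DELETE FROM Employees
WHERE EmployeeID = 101;

-- The deleted employee's details should now be in the log table.
SELECT EmployeeID, FirstName, LastName, DeletionDate
FROM DeletedEmployees;
```

The log table deliberately has no primary key on EmployeeID here, so the same ID can be logged again if it is later reused and deleted a second time.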
Trigger Timing Options

1. BEFORE vs. AFTER:
● BEFORE: Executes the trigger logic before the triggering event (e.g., before inserting or updating a row).
● AFTER: Executes the trigger logic after the triggering event.
Example - BEFORE Trigger:
Prevent inserting employees with salaries below 30,000.
Query:
CREATE TRIGGER CheckMinSalary
BEFORE INSERT ON Employees
FOR EACH ROW
BEGIN
    IF NEW.Salary < 30000 THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'Salary must be at least 30,000';
    END IF;
END;

Explanation:
● The trigger checks the salary before insertion.
● If the salary is below 30,000, it raises an error using SIGNAL, and the row is not inserted.
SQL Data Types
Definition:
● Data Types define the kind of data that can be stored in a column.
● Choosing the right data type ensures efficient storage and prevents errors.
Common SQL Data Types:
1. Numeric:
● INT, BIGINT, DECIMAL(p, s)
2. String:
● CHAR(n), VARCHAR(n), TEXT
3. Date/Time:
● DATE, TIME, DATETIME, TIMESTAMP
4. Boolean:
● BOOLEAN (or equivalent, depending on the RDBMS).

Choosing the Right Data Type:
● Use INT for whole numbers, DECIMAL for precise decimals.
● Use VARCHAR for variable-length strings, CHAR for fixed-length strings.
● Use DATE or DATETIME for date-related data.
Example - Using Data Types

Scenario:
Create a table to store employee details with appropriate data types.
Query:
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Salary DECIMAL(10, 2),
    HireDate DATE,
    IsActive BOOLEAN
);
Explanation:
● EmployeeID: Integer for unique identification.
● FirstName and LastName: Variable-length strings of up to 50 characters.
● Salary: Decimal with up to 10 digits in total, 2 of them after the decimal point.
● HireDate: Stores the date of hiring.
● IsActive: Boolean to indicate active status.
Practical Exercise

Task 1:
Create a trigger that logs updates to the Salary column in the Employees table into a SalaryChangeLog table.
Hint:
Use an AFTER UPDATE trigger and insert the old and new salary values into the log table.
Task 2:
Design a table to store customer information with appropriate data types. Include fields for CustomerID, Name, Email, Phone, and RegistrationDate.
Hint:
Choose numeric, string, and date/time data types as needed.
Recap and Key Takeaways of Session 3.5

What We Learned Today:

1. Triggers automate actions based on events like INSERT, UPDATE, or DELETE.
2. Triggers can enforce business rules, log changes, or maintain data integrity.
3. SQL Data Types define the kind of data stored in columns. Choosing the right type ensures efficiency and accuracy.
4. Use triggers and data types effectively to build robust databases.
Next Steps:
● Practice creating triggers and designing tables with appropriate data types.
● Explore advanced topics like indexing in future sessions.
Questions?

Topic 3.6: Review of Terms - Physical and Logical Records, Blocking Factor, Pinned and Unpinned Organization, Heap Files, Indexing

Module 3

What We’ll Cover in This Session:
1. Introduction to Physical and Logical Records
● Definitions and Differences
2. Blocking Factor
● What is it? Why is it important?
3. Pinned vs. Unpinned Organization
● Characteristics and Use Cases
4. Heap Files
● Structure and Behavior
5. Indexing
● Overview and Importance
6. Recap and Key Takeaways
Physical vs. Logical Records

Definition:
● Logical Record: A record as perceived by the user or application (e.g., a row in a table).
● Physical Record: How the logical record is stored on disk (e.g., blocks or pages).
Example:
A logical record might be a single employee record, while the physical record could store multiple employee records in a block on disk.
Blocking Factor

Definition:
● The blocking factor is the number of logical records that can fit into a single physical block (or page) on disk.
Formula:
Blocking Factor = ⌊Block Size / Record Size⌋ (rounded down to a whole number of records)

Importance:
● Maximizes disk space utilization.
● Reduces I/O operations by reading/writing multiple records at once.
Example:
If a block size is 4 KB and each record is 1 KB, the blocking factor is 4 (4 records per block).
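When the record size does not divide the block size evenly, the quotient is rounded down. This assumes fixed-length records that are never split across two blocks, which the slide does not state explicitly. The number of blocks needed for a file then follows by rounding up. The symbols B (block size), R (record size), bfr (blocking factor), r (number of records), and b (number of blocks) are introduced here only for illustration:

```latex
\[
\mathit{bfr} = \left\lfloor \frac{B}{R} \right\rfloor ,
\qquad
b = \left\lceil \frac{r}{\mathit{bfr}} \right\rceil
\]
\[
B = 4096,\ R = 200:\quad
\mathit{bfr} = \left\lfloor \frac{4096}{200} \right\rfloor
             = \lfloor 20.48 \rfloor = 20
\]
```

In this case 20 records occupy 4000 bytes, and the remaining 96 bytes of each block stay unused.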
Pinned vs. Unpinned Organization

Pinned Organization:
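● Records are fixed in specific locations on disk.
● Useful for systems requiring predictable access patterns.
Unpinned Organization:
● Records can move around on disk (e.g., during reorganization).
● More flexible but may require additional overhead to track record locations.
Comparison:
● Record location: fixed for pinned records; may change for unpinned records.
● Trade-off: pinned organization gives predictable access; unpinned organization is more flexible but adds overhead to track record locations.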
Heap Files

Definition:
● A heap file is an unsorted collection of records stored on disk.
● Records are inserted without any specific order.
Characteristics:
● Simple to implement.
● Fast for insertions since no sorting or indexing is required.
● Slow for searches since records must be scanned sequentially.

Example:
A heap file might store employee records in the order they were added, regardless of their IDs or names.
Indexing

Definition:
● Indexing is a technique used to improve the speed of data retrieval operations on a database.
● An index is a data structure (e.g., B-tree, hash table) that maps keys to record locations.

Types of Indexes:
1. Primary Index: Built on the primary key.
2. Secondary Index: Built on non-primary key columns.
3. Clustered Index: Determines the physical order of data.
4. Non-Clustered Index: Stores a separate structure pointing to data.
Benefits of Indexing:
● Faster query performance.
● Enables efficient range queries and sorting.
Drawbacks:
● Increases storage requirements.
● Slows down insertions and updates due to index maintenance.
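In SQL, a secondary index is typically created with CREATE INDEX. The sketch below assumes the Employees table from the earlier sessions; the index name idx_employees_lastname and the search value are hypothetical.

```sql
-- Build a secondary (non-clustered) index on a non-key column.
CREATE INDEX idx_employees_lastname
ON Employees (LastName);

-- A lookup on LastName can now use the index
-- instead of scanning every row of the table.
SELECT EmployeeID, FirstName, LastName
FROM Employees
WHERE LastName = 'Nair';
```

Whether the index is actually used is decided by the query optimizer, and each additional index adds work to every INSERT, UPDATE, and DELETE on the table.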
Practical Exercise

Task 1:
Calculate the blocking factor for a system where:
● Block size = 8 KB
● Record size = 2 KB
Hint:
Use the formula:
Blocking Factor = Block Size / Record Size
Task 2:
Explain why heap files are inefficient for large-scale databases with frequent search operations.
Hint:
Consider the sequential scanning process and lack of ordering.
Recap and Key Takeaways 3.6

What We Learned Today:


1. Logical Records represent data from the user's perspective, while Physical Records describe how that data is stored on disk.
2. Blocking Factor determines how many records fit into a block, optimizing disk usage.
3. Pinned Organization fixes record locations, while Unpinned Organization allows flexibility.
4. Heap Files store records without order, making them simple but inefficient for searches.
5. Indexing improves query performance by organizing data for faster retrieval.
Next Steps:
● Explore advanced indexing techniques like B-Trees and hashing in future sessions.

Questions?
Topic 3.7: Single-Level Indices, Numerical Examples
Module 3
What We’ll Cover in This Session:

1. Introduction to Indexing
● Why do we need indices?
2. Types of Single-Level Indices
● Primary Index
● Secondary Index
3. Numerical Examples
● Calculating Index Size and Search Efficiency
4. Practical Exercises
5. Recap and Key Takeaways
What is Indexing?

Definition:
● An index is a data structure that improves the speed of data retrieval operations on a database table.
● It works like an index in a book, allowing quick access to specific data without scanning the entire table.

Why Use Indexing?
● Faster query performance.
● Enables efficient range queries and sorting.
Drawbacks:
● Increases storage requirements.
● Slows down insertions and updates due to index maintenance.
Single-Level Indices

Definition:
● A single-level index is an index that uses a single level of entries to map keys to record locations.
● It is simpler than multi-level indices but may not scale well for very large datasets.
Types of Single-Level Indices:
1. Primary Index: Built on the primary key of a table.
● Assumes records are stored in sorted order by the primary key.
2. Secondary Index: Built on non-primary key columns.
● Can be created on unsorted data.
Primary Index

Definition:
● A primary index is built on the primary key of a table.
● It assumes that records are stored in sorted order by the primary key.
Structure:
● Each entry in the index contains:
● Key value (e.g., primary key).
● Pointer to the block where the record is stored.
● Because the file is sorted, the index needs only one entry per data block rather than one per record.
Advantages:
● Efficient for range queries.
● Reduces the number of disk accesses.
Example: Suppose we have a table with 1000 records sorted by EmployeeID. The primary index might look like this:
[Figure not recovered from the source: sample primary index entries.]
Secondary Index

Definition:
● A secondary index is built on non-primary key columns.
● It allows indexing on fields other than the primary key.
Structure:
● Each entry in the index contains:
● Key value (e.g., a secondary column like LastName).
● Pointer to the record location.
Advantages:
● Enables fast searches on non-primary key columns.
● Useful for tables with multiple search criteria.
Disadvantages:
● Requires additional storage.
● May slow down insertions and updates.
Example: Suppose we want to index employees by LastName. The secondary index might look like this:
[Figure not recovered from the source: sample secondary index entries.]
Numerical Example - Secondary Index

Scenario:
The same file (10,000 records) has a secondary index on LastName, with each index entry being 20 bytes. Calculate:
1. Number of index entries.
2. Size of the secondary index.
Solution:
1. Number of Index Entries:
Since the secondary index has one entry for every record:
Number of Entries = Total Records = 10,000
2. Size of Secondary Index:
Index Size = Number of Entries × Entry Size = 10,000 × 20 = 200,000 bytes ≈ 195 KB (taking 1 KB = 1024 bytes)
Practical Exercise

Task 1:
A file contains 5000 records, each of size 200 bytes. The block size is 4 KB. Calculate:
1. Blocking factor.
2. Number of blocks needed to store the file.
3. Size of the primary index if each index entry is 12 bytes.
Task 2:
If the same file has a secondary index on DepartmentID, with each index entry being 18 bytes, calculate the size of the secondary index.
Topic 3.8: Multi-Level Indices, Numerical Examples
What We’ll Cover in This Session:
1. Introduction to Multi-Level Indices
● Why do we need multi-level indices?
2. Structure of Multi-Level Indices
● How they work and their advantages
3. Numerical Examples
● Calculating index levels and search efficiency
4. Practical Exercises
5. Recap and Key Takeaways
What are Multi-Level Indices?

Definition:
● A multi-level index is an indexing technique that uses multiple levels of indices to map keys to record locations.
● It is used to overcome the limitations of single-level indices when dealing with very large datasets.
Why Use Multi-Level Indices?
● Reduces the number of disk accesses required for searching.
● Scales better than single-level indices for large datasets.
Key Characteristics:
● The top level contains pointers to the next level.
● The bottom level points to actual data blocks.
Structure of Multi-Level Indices
Explanation:
● A multi-level index is like a tree structure where each level reduces the search space.
● The top level (root) points to intermediate levels.
● The bottom level points to data blocks.

Example:
Suppose we have a file with 1 million records. A multi-level index might look like this:
[Figure not recovered from the source: multi-level index diagram.]
Advantages of Multi-Level Indices

1. Efficient Search:
● Reduces the number of disk accesses by narrowing down the search space at each level.
2. Scalability:
● Handles very large datasets more effectively than single-level indices.
3. Flexibility:
● Can be combined with other indexing techniques like B-Trees.
Disadvantages:
● Increased storage requirements due to multiple levels.
● More complex to implement and maintain.
Numerical Example - Multi-Level Index

Scenario:
A file contains 1,000,000 records, each of size 200 bytes. The block size is 4 KB. Each index entry is 12 bytes. Calculate:
1. Blocking factor for data blocks.
2. Number of blocks needed to store the file.
3. Number of levels in the multi-level index.
Solution:
1. Blocking Factor:
Blocking Factor = ⌊Block Size / Record Size⌋ = ⌊4096 / 200⌋ = 20 records per block
2. Number of Blocks:
Number of Blocks = Total Records / Blocking Factor = 1,000,000 / 20 = 50,000 blocks
Numerical Example - Multi-Level Index (Continued)

3. Number of Levels in the Multi-Level Index:
Each index block can hold:
Entries per Index Block = ⌊Block Size / Entry Size⌋ = ⌊4096 / 12⌋ = 341 entries
● Bottom Level: Has one entry per data block, so 50,000 entries. Requires:
Blocks in Bottom Level = ⌈50,000 / 341⌉ = 147 blocks
● Top Level (Root): Has one entry per bottom-level block, so 147 entries. Requires:
Blocks in Top Level = ⌈147 / 341⌉ = 1 block
Because the top level fits in a single block, it is the root, and no further level is needed.
Total Levels: 2 (Root → Bottom).

Search Efficiency with Multi-Level Indices

Scenario:
How many disk accesses are required to retrieve a record using a multi-level index?
Explanation:
● Each level reduces the search space.
● For the previous example:
● Access the root block (1 access).
● Access the bottom-level index block (1 access).
● Access the data block (1 access).
Total Disk Accesses: 3
Comparison with Single-Level Index:
● A single-level index of 147 blocks, searched with binary search, needs about ⌈log2 147⌉ = 8 index block accesses plus 1 data block access, or 9 in total.
● With no index at all, a linear search could read all 50,000 data blocks in the worst case.
Practical Exercise

Task 1:
A file contains 2,000,000 records, each of size 150 bytes. The block size is 8 KB. Each index entry is 16 bytes. Calculate:
1. Blocking factor for data blocks.
2. Number of blocks needed to store the file.
3. Number of levels in the multi-level index.
Task 2:
If the same file uses a single-level index, how many disk accesses are required in the worst case? Compare it with the
multi-level index.
Recap and Key Takeaways

What We Learned Today:


1. Multi-Level Indices reduce the number of disk accesses by organizing index entries into hierarchical levels.
2. They scale better than single-level indices for very large datasets.
3. Numerical Examples demonstrate how multi-level indices improve search efficiency.
4. Multi-level indices trade off increased storage for faster retrieval.
Next Steps:
● Explore advanced indexing techniques like B-Trees and hashing in future sessions.

Questions?
Thank you
