IT252 MNotes Complete 18042025

The document outlines the course plan for the Database Systems (IT252) at the National Institute of Technology Karnataka for the academic year 2024-25. It includes a comprehensive description of the course structure, evaluation criteria, and educational objectives, focusing on the design, development, and management of database systems. Key topics covered range from basic DBMS concepts to advanced topics like SQL, normalization, and ethical considerations in data management.


NATIONAL INSTITUTE OF TECHNOLOGY KARNATAKA, SURATHKAL
DEPARTMENT OF INFORMATION TECHNOLOGY
BACHELOR OF TECHNOLOGY IN INFORMATION TECHNOLOGY

DATABASE SYSTEMS
IT252 MINOR
SEMESTER IV
COURSE PLAN 2024-25

PREPARED BY
Dr. SURESAN PARETH

April 18, 2025

Contents

1 DATABASE SYSTEMS
2 Course Description
3 Course Structure
    3.1 Expected Learning Outcomes
4 Evaluation Criteria
5 Introduction to DBMS
6 Types of DBMS
7 DBMS Architecture
8 Relational Database Management System (RDBMS)
    8.1 Introduction
    8.2 Key Features of RDBMS
    8.3 Components of RDBMS
    8.4 Concepts in RDBMS
9 Rules of Relational Databases
    9.1 Relationships in RDBMS
    9.2 Keys in RDBMS
10 Constraints in RDBMS
11 Structured Query Language (SQL)
    11.1 Types of SQL Commands:
    11.2 SQL Basics
    11.3 Altering Table Structure in MySQL
12 ACID Properties in RDBMS
13 Transactions in RDBMS
14 RDBMS vs. NoSQL
15 SMART Health Management Database
    15.1 Sample Data Insertion
16 AI-Powered Recruitment System
    16.1 Data Insertion
17 Relational Algebra
    17.1 Practical scenario
    17.2 Relational Algebra Queries and SQL Equivalents
18 Cross Product and Join in RDBMS
19 JOIN Operations in AI Matching Recruitment System
20 AI Matching Recruitment System with Conventional DBMS Models
21 Server Hierarchy
22 Functional Dependency in RDBMS
    22.1 Example
    22.2 Practical Use Case
23 Functional Dependencies and Normal Forms
24 Normalization in Relational Databases
25 SQL-Based Normalization Walkthrough
26 SQL Inserts, Joins, and Updates Without Anomalies
27 SQL DELETE and Cascading Relationships
28 Lossless and Lossy Joins in RDBMS
29 Checking for Lossless vs Lossy Join in RDBMS
30 Stored Procedures in Relational Databases
31 MySQL Stored Procedure Examples
32 MySQL Triggers
33 Stored Procedures and Triggers in AI-Driven Recruitment
34 Aggregate Functions in SQL
35 Aggregate Functions in AI-Driven Recruitment
36 Indexes in RDBMS
37 B-Trees in RDBMS
38 B+ Trees in RDBMS
39 Constructing a B+ Tree of Order 3
40 Views in RDBMS
41 Views in AI-Driven Recruitment
42 Security and Backup in RDBMS
43 Transactions and Related Topics in RDBMS
44 Additional SQL Transaction-Based Questions and Answers
45 Transaction Management in AI-Driven Recruitment
46 Serialization in RDBMS
47 Serialization in AI-Driven Recruitment

1 DATABASE SYSTEMS

Sub Code              IT252
Evaluation Criteria   ASSIGNMENT + QUIZ + MIDTERM + PROJECT + ENDTERM
L-T-P                 3-0-2
Total Hours           35 (T) + 20 (P)
Total Marks           100
Credits               04
Exam Hours            03

2 Course Description

Database Management Systems (DBMS) is a multidisciplinary field that intersects computer science, information technology, and data engineering. It focuses on the design, development, and management of systems that store, retrieve, and process large volumes of structured data efficiently. DBMS serves as the backbone for applications across various domains, including science, healthcare, finance, and e-commerce, enabling data-driven decision-making and operational excellence.

DBMS encompasses a wide range of concepts, from the fundamental principles of relational models to advanced topics in distributed systems and big data management. Key areas include data modeling, normalization, indexing, query optimization, transaction management, and database security. The field integrates tools and techniques from mathematics, programming, and system design to ensure reliable, scalable, and high-performance data management.

The applications of DBMS are vast and diverse. They range from managing patient records in healthcare systems to powering recommendation engines in e-commerce platforms, and from supporting financial transactions to enabling real-time analytics in IoT systems. Modern DBMS platforms, such as MySQL, PostgreSQL, MongoDB, and Oracle, provide the foundation for these applications, leveraging computational tools and frameworks like SQL, NoSQL, and cloud-based database services.

This course introduces students to the foundational concepts and practical skills required to design and implement database systems. Topics covered include data modeling using Entity-Relationship diagrams, relational algebra, Structured Query Language (SQL), schema design, normalization, indexing, and query optimization. Advanced topics, such as transactions, concurrency control, distributed databases, and NoSQL systems, are also explored.

The course emphasizes real-world applications through hands-on projects and case studies. Students will design and implement database solutions for complex scenarios, such as healthcare management, e-commerce, and logistics, using modern database tools. Ethical considerations, including data privacy, security, and sustainability, are integral to understanding the impact of database systems in a connected world.

Through this course, students will develop a deep understanding of database concepts and acquire practical skills to manage data effectively. This knowledge equips them to solve complex data challenges and prepares them for roles in academia, industry, and research, where data plays a pivotal role in innovation and decision-making.

Course Educational Objectives (CEOs) for Database Management Systems (DBMS):

NITK, Release DB COURSE PLAN-2024-25

To provide students with a fundamental understanding of database principles and systems:
• Introduction to database concepts, including data organization, relational models, and key database management features.
• Understanding the components of database systems, such as schema design, queries, and transactions.
• Learning the differences between relational and non-relational databases, and centralized versus distributed systems.
To equip students with the knowledge and tools for designing and managing databases:
• Study of data modeling techniques such as Entity-Relationship (ER) diagrams and schema normalization.
• Hands-on practice with database query languages, including SQL for relational databases and the query interfaces of NoSQL systems for non-relational databases.
• Understanding indexing, query optimization, and performance tuning for efficient data retrieval.
To enable students to apply database management techniques to practical applications:
• Applications in healthcare, e-commerce, logistics, and financial systems, showcasing the versatility of DBMS.
• Case studies involving real-world scenarios like customer relationship management, resource scheduling, and data analysis.
• Integration of database systems with programming languages and frameworks for end-to-end solutions.
To promote an understanding of ethical considerations and responsible data management practices:
• Recognizing the importance of data privacy, security, and integrity in database systems.
• Exploring sustainable database management practices, including energy-efficient storage and minimizing redundancy.
To familiarize students with emerging trends and research in database systems:
• Exploring advanced database architectures, such as distributed databases, data warehousing, and cloud databases.
• Understanding modern trends like NoSQL databases, graph databases, and real-time analytics.
• Studying research challenges, including scalability, fault tolerance, and the integration of DBMS with big data and machine learning systems.
Course Outcomes (COs) for Database Management Systems (DBMS):
CO1: Develop a strong foundation in database management principles, including relational models, data modeling, schema design, and SQL, for efficient data storage and retrieval.
CO2: Demonstrate proficiency in designing and implementing normalized databases, optimizing query performance, and managing transactions for real-world applications.
CO3: Apply database management skills to practical applications across various domains, integrating relational and non-relational databases with front-end and server-side technologies.
CO4: Understand the ethical implications of database management, emphasizing data security, privacy, compliance, and sustainability in designing and managing database systems.

3 Course Structure

Week 1: Introduction to DBMS and ER Modeling
Topics:
• Introduction to Databases:
– What is DBMS?
– Types of databases (Relational, NoSQL, etc.).
• Entity-Relationship (ER) Modeling:
– Entities, attributes, and relationships.
– Primary keys, foreign keys, and cardinality.
Activities:
• Discuss SHMS (Smart Health Management System) as a real-world problem:
– Define entities (e.g., Patient, Doctor, Appointment).
– Highlight relationships (e.g., One-to-Many between Doctor and Appointment).
• Task: Create an ER diagram for SHMS.
Week 2: Schema Design and Normalization
Topics:
• Schema Design:
– Translating an ER diagram into relational tables.
• Normalization:
– 1NF, 2NF, and 3NF concepts.
– Avoid redundancy and dependency issues.
Activities:
• Convert the SHMS ER diagram into a relational schema.
• Normalize SHMS tables:
– Example: Split the Patient table into Patient and Patient_Contact.
• Task: Design schemas for all SHMS entities (Patient, Doctor, Appointment, etc.).
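
As a rough sketch of what the Week 2 schema task could produce, the MySQL DDL below defines a few SHMS tables. Patient_ID, Doctor_ID, Name, Appointment_Date, and Status appear in this course plan's own examples; Appointment_ID, Phone, and every data type and length are assumptions made only for illustration. The sketch follows the Week 2 split of contact details into Patient_Contact, whereas the Week 7 trigger example reads a Contact column directly on Patient.

```sql
-- Hypothetical SHMS schema sketch; data types and lengths are assumptions.
CREATE TABLE Patient (
    Patient_ID INT PRIMARY KEY,
    Name       VARCHAR(100) NOT NULL
);

-- Contact details split out of Patient, as in the Week 2 normalization example.
CREATE TABLE Patient_Contact (
    Patient_ID INT NOT NULL,
    Phone      VARCHAR(20) NOT NULL,   -- assumed column
    PRIMARY KEY (Patient_ID, Phone),
    FOREIGN KEY (Patient_ID) REFERENCES Patient (Patient_ID)
);

CREATE TABLE Doctor (
    Doctor_ID INT PRIMARY KEY,
    Name      VARCHAR(100) NOT NULL
);

-- One doctor has many appointments (One-to-Many), expressed by the foreign key.
CREATE TABLE Appointment (
    Appointment_ID   INT AUTO_INCREMENT PRIMARY KEY,  -- assumed surrogate key
    Patient_ID       INT NOT NULL,
    Doctor_ID        INT NOT NULL,
    Appointment_Date DATE NOT NULL,
    Status           VARCHAR(20) NOT NULL,
    FOREIGN KEY (Patient_ID) REFERENCES Patient (Patient_ID),
    FOREIGN KEY (Doctor_ID)  REFERENCES Doctor (Doctor_ID)
);
```

Parent tables (Patient, Doctor) are created before the tables that reference them, so each foreign key can be resolved at creation time.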
Week 3: SQL Basics (DDL, DML)
Topics:
• Data Definition Language (DDL):
– Creating and modifying tables.
• Data Manipulation Language (DML):
– Insert, update, delete, and retrieve data.
Activities:
• Hands-On:
– Create SHMS tables using CREATE TABLE.
– Populate tables with sample data using INSERT.
• Query basics:
– Retrieve all appointments for a specific doctor.
– Example:

SELECT *
FROM Appointment
WHERE Doctor_ID = 101;

• Task: Write SQL queries for basic CRUD operations on SHMS.
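
The four CRUD operations named in the task might look like the following minimal sketch. It assumes a Patient table with Patient_ID and Name columns (the names used in the Week 4 join example); the sample values are made up.

```sql
-- Create: add a patient (ID and name are made-up sample values).
INSERT INTO Patient (Patient_ID, Name) VALUES (1, 'Asha Rao');

-- Read: look the patient up by primary key.
SELECT Patient_ID, Name FROM Patient WHERE Patient_ID = 1;

-- Update: correct the stored name.
UPDATE Patient SET Name = 'Asha R. Rao' WHERE Patient_ID = 1;

-- Delete: remove the row again.
DELETE FROM Patient WHERE Patient_ID = 1;
```

Filtering UPDATE and DELETE on the primary key keeps each statement limited to a single row.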
Week 4: Advanced SQL Queries
Topics:
• Joins (INNER, LEFT, RIGHT).
• Aggregate functions (COUNT, SUM, AVG).
• Grouping and sorting.
Activities:
• Teach joins with SHMS queries:
– Example: Retrieve all confirmed appointments with patient and doctor details:

SELECT P.Name, D.Name AS Doctor_Name, A.Appointment_Date
FROM Appointment A
JOIN Patient P ON A.Patient_ID = P.Patient_ID
JOIN Doctor D ON A.Doctor_ID = D.Doctor_ID
WHERE A.Status = 'Confirmed';

• Practice grouping:
– Count total appointments per doctor.
• Task: Students create advanced queries for SHMS.
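
The grouping exercise (total appointments per doctor) could be written roughly as follows. This is one possible form, using only the Doctor and Appointment columns that appear in the join example above; the alias Total_Appointments is an illustrative name.

```sql
-- Count appointments per doctor; LEFT JOIN keeps doctors with no appointments.
SELECT D.Doctor_ID,
       D.Name AS Doctor_Name,
       COUNT(A.Doctor_ID) AS Total_Appointments
FROM Doctor D
LEFT JOIN Appointment A ON A.Doctor_ID = D.Doctor_ID
GROUP BY D.Doctor_ID, D.Name
ORDER BY Total_Appointments DESC;
```

COUNT(A.Doctor_ID) ignores the NULLs produced by the LEFT JOIN, so a doctor without appointments is reported with a count of 0 rather than 1.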
Week 5: Indexing and Query Optimization
Topics:
• Indexes:
– Single-column and composite indexes.
• Query optimization techniques:
– Analyzing query execution plans.
Activities:
• Add indexes to SHMS tables:
– Example:

CREATE INDEX idx_appointment_date ON Appointment(Appointment_Date);

• Analyze query performance before and after indexing.
• Task: Optimize SHMS queries with indexing.
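
One way to carry out the before-and-after comparison in MySQL is to inspect the execution plan with EXPLAIN. The sketch below assumes the Appointment table and the idx_appointment_date index from the example above.

```sql
-- Run once before and once after creating idx_appointment_date,
-- then compare the two plans.
EXPLAIN
SELECT *
FROM Appointment
WHERE Appointment_Date = '2024-12-21';
```

Without the index, the plan typically reports type = ALL (a full table scan); with it, the key column usually names idx_appointment_date. On very small tables the optimizer may still choose a full scan, so the difference is easier to observe with a realistic amount of sample data.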
Week 6: Stored Procedures
Topics:
• What are Stored Procedures?
– Benefits and use cases.
• Writing stored procedures.
Activities:
• Create stored procedures for SHMS:
– Example: Book an appointment:

-- In the mysql client, change the delimiter so the ';' inside the body
-- does not end the CREATE PROCEDURE statement early.
DELIMITER //

CREATE PROCEDURE BookAppointment(IN p_id INT, IN d_id INT, IN appt_date DATE)
BEGIN
    INSERT INTO Appointment (Patient_ID, Doctor_ID, Appointment_Date, Status)
    VALUES (p_id, d_id, appt_date, 'Confirmed');
END //

DELIMITER ;

• Practice calling procedures:

CALL BookAppointment(1, 101, '2024-12-21');

• Task: Write stored procedures for common SHMS operations (e.g., updating patient details).
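
For the task of updating patient details, a procedure along the following lines is one option. The procedure name, its parameters, and the VARCHAR(100) type are hypothetical; the Contact column on Patient is assumed from the Week 7 trigger example.

```sql
DELIMITER //

-- Hypothetical procedure: update the contact value stored for one patient.
CREATE PROCEDURE UpdatePatientContact(IN p_id INT, IN new_contact VARCHAR(100))
BEGIN
    UPDATE Patient
    SET Contact = new_contact
    WHERE Patient_ID = p_id;
END //

DELIMITER ;

-- Sample call with made-up values.
CALL UpdatePatientContact(1, '9876543210');
```

Prefixing the parameters (p_id, new_contact) keeps them distinct from the column names, which avoids ambiguity inside the UPDATE statement.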

Week 7: Triggers
Topics:
• What are Triggers?
– Types of triggers (BEFORE, AFTER).
• Automating actions with triggers.
Activities:
• Create SHMS triggers:
– Example: Log updates to patient contact:

-- DELIMITER is needed in the mysql client because the body contains ';'.
DELIMITER //

CREATE TRIGGER LogPatientUpdate
AFTER UPDATE ON Patient
FOR EACH ROW
BEGIN
    -- OLD and NEW hold the row values before and after the update.
    INSERT INTO Patient_Audit (Patient_ID, Old_Contact, New_Contact)
    VALUES (OLD.Patient_ID, OLD.Contact, NEW.Contact);
END //

DELIMITER ;

• Test triggers by updating data in SHMS.
• Task: Students write triggers for other SHMS modules (e.g., notifications for low stock).

Week 8: Transactions
Topics:
• What are Transactions?
– ACID properties.
• Using BEGIN, COMMIT, and ROLLBACK.
Activities:
• Implement transactional workflows in SHMS:
– Example: Book an appointment and generate a bill in one transaction:

BEGIN;

INSERT INTO Appointment (...);

INSERT INTO Billing (...);

COMMIT;

• Test failure scenarios and use ROLLBACK.
• Task: Write and test transactions for SHMS.
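
A failure scenario with ROLLBACK can be rehearsed as in the sketch below. It assumes the Appointment columns used in the Week 6 procedure and a transactional storage engine such as InnoDB; the sample values are made up, and the billing failure is simulated by issuing ROLLBACK by hand.

```sql
START TRANSACTION;

INSERT INTO Appointment (Patient_ID, Doctor_ID, Appointment_Date, Status)
VALUES (1, 101, '2024-12-21', 'Confirmed');

-- Suppose the billing step fails here (for example, a constraint violation).
-- Undo the appointment as well so the database is left unchanged.
ROLLBACK;

-- The appointment inserted above should no longer be visible.
SELECT COUNT(*) AS Matching_Rows
FROM Appointment
WHERE Patient_ID = 1 AND Doctor_ID = 101 AND Appointment_Date = '2024-12-21';
```

Comparing Matching_Rows before the transaction and after the ROLLBACK is a simple check of atomicity: either both the appointment and the bill are stored, or neither is.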
Week 9: Views
Topics:
• What are Views?
– Advantages and limitations.
• Creating and using views.
Activities:
• Create SHMS views:
– Example: Active appointments:

CREATE VIEW ActiveAppointments AS
SELECT * FROM Appointment WHERE Status = 'Confirmed';

• Query the views:

SELECT * FROM ActiveAppointments WHERE Appointment_Date = '2024-12-21';

• Task: Students create views for billing summaries and patient histories.
Week 10: Testing and Deployment
Topics:
• Testing strategies:
– Unit testing for procedures and triggers.
– Integration testing for transactions.
• Deployment:
– Best practices for moving from development to production.
Activities:
• Test SHMS with sample data:
– Verify indexing improves query speed.
– Check trigger functionality.
• Discuss scalability options:
– Partitioning and replication.
• Task: Prepare a presentation summarizing SHMS.

3.1 Expected Learning Outcomes

• Mastery of core DBMS concepts (ER modeling, normalization, SQL).
• Hands-on experience with advanced topics (stored procedures, triggers, indexing, transactions).
• Ability to design and implement a functional database system.

Textbooks
1. Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 2014.
2. R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 7th Ed., Pearson, 2017.
3. A. Silberschatz, H. F. Korth, S. Sudarshan, Database System Concepts, 6th Ed., McGraw-Hill, 2010.

4 Evaluation Criteria

Assignments (10%);
Quiz (1) (10%);
Mid Term (20%);
Project (20%);
Final Exam (40%).

Component              Content
Quiz                   Covers Weeks 1-5
Midterm Examination    Covers Weeks 1-7
Project Evaluation     Abstract, Project Report and Presentation (2)
Endterm Examination    Covers Weeks 1-15

5 Introduction to DBMS

A Database Management System (DBMS) is software that helps in the creation, organization, storage, retrieval, and management of data in databases. Think of it as an advanced filing system that doesn’t just store data but also ensures that the data is secure, consistent, and easily accessible.
Why is a DBMS Important?
• Efficient Data Handling: Instead of manually managing large amounts of data, a DBMS automates the process, making it faster and less error-prone.
• Centralized Management: All data is stored in a central system, making it easy to update and maintain.
• Multiple User Access: A DBMS lets many users work with the data at the same time without overwriting or corrupting each other’s changes.
Key Functions of DBMS:
1. Data Storage & Retrieval
• Storage: A DBMS stores data in structured formats such as tables made up of rows and columns, making it easy to organize.
• Retrieval: Users can quickly search, filter, and retrieve specific data using query languages like SQL (Structured Query Language).
Example:
In a banking system, the DBMS helps store customer details and allows retrieval of account information whenever needed.
2. Data Security & Integrity
• Data Security: Protects sensitive data from unauthorized access using passwords, encryption, and access control mechanisms.
• Data Integrity: Ensures that the data is accurate, consistent, and reliable over time.
Example:
In an e-commerce website, only authorized employees can modify product prices, ensuring security, while data integrity ensures that customer orders aren’t duplicated or lost.
3. Concurrency Control
• Definition: Allows multiple users to access or modify the database at the same time without conflicts.
• Why It’s Needed: Imagine two people trying to book the same seat on a flight. The DBMS ensures only one booking is confirmed, preventing double booking.
• Mechanisms Used: Locking, transaction management, and isolation levels.
4. Backup & Recovery
• Backup: A DBMS can automatically create backups of the data at regular intervals.
• Recovery: If there’s a system crash, power failure, or hardware issue, the DBMS helps restore the data to its last known good state.
Example:
In hospitals, patient records are critical. Even if the server fails, DBMS recovery features ensure data isn’t lost.
5. Data Independence
• Definition: The structure or storage of the data can be changed without affecting the programs or applications that access it.
• Types of Data Independence:
– Logical Data Independence: Changes in the logical structure (like adding new fields) don’t affect application programs.
– Physical Data Independence: Changes in physical storage (like moving data from one server to another) don’t impact how applications access the data.
• Why It’s Useful: This makes system upgrades or changes easy without needing to rewrite application code.
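
The flight-seat scenario under Concurrency Control can be sketched with a locking read in MySQL (InnoDB). The Seat table, its columns, and all values here are hypothetical and serve only to illustrate how a row lock prevents double booking.

```sql
-- Hypothetical table: Seat(Flight_ID, Seat_No, Booked_By), Booked_By NULL = free.
START TRANSACTION;

-- Lock the seat row; a second session running the same statement waits here.
SELECT Booked_By
FROM Seat
WHERE Flight_ID = 42 AND Seat_No = '12A'
FOR UPDATE;

-- Book only if the seat is still free; affects 0 rows if it was already taken.
UPDATE Seat
SET Booked_By = 7
WHERE Flight_ID = 42 AND Seat_No = '12A' AND Booked_By IS NULL;

COMMIT;
```

The second session proceeds only after the first one commits, at which point its UPDATE no longer matches the row because Booked_By is not NULL, so only one booking is confirmed.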
6 Types of DBMS

DBMS can be categorized into several types based on their data models and architecture:
1. Hierarchical DBMS
• Structure: Data is organized in a tree-like structure with parent-child relationships. Each parent node can have multiple child nodes, but each child node has only one parent. This rigid hierarchy ensures a clear, organized flow of data.
• Advantages: Fast data retrieval, simple relationships, and efficient for handling large volumes of data with clear hierarchies.
• Disadvantages: Inflexible structure, difficult to reorganize data, and complex to handle many-to-many relationships.
• Example: IBM Information Management System (IMS), which is widely used in banking and telecommunications industries.
2. Network DBMS
• Structure: Uses a graph structure allowing multiple parent-child relationships, forming a network model. This makes it more flexible than the hierarchical model, as data can be accessed through various paths.
• Advantages: Supports complex relationships, faster traversal for certain queries, and flexible connections.
• Disadvantages: Complex design, difficult to maintain, and requires skilled professionals to manage.
• Example: Integrated Data Store (IDS), used in industries requiring complex data modeling like manufacturing and telecommunications.
3. Relational DBMS (RDBMS) (Most commonly used today)
• Structure: Data is stored in tables (relations) consisting of rows and columns. Each table can be linked to others using keys (primary and foreign keys), allowing complex data relationships.
• Query Language: Uses Structured Query Language (SQL) for querying, updating, and managing data efficiently.
• ACID Properties: Ensures Atomicity, Consistency, Isolation, and Durability, making transactions reliable.

• Advantages: Simple data modeling, easy to understand, powerful query capabilities, and strong data integrity.
• Disadvantages: Performance issues with very large datasets, less suitable for unstructured data.
• Examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, SQLite.
4. Object-Oriented DBMS (OODBMS)
• Structure: Stores data as objects, similar to how objects are used in object-oriented programming languages like Java or C++. Objects include both data and the methods (functions) that operate on the data.
• Advantages: Seamless integration with object-oriented programming, reusable code, and better handling of complex data.
• Disadvantages: Less mature than RDBMS, limited community support, and compatibility issues with existing relational data.
• Example: ObjectDB, db4o, used in applications requiring complex data representation, such as CAD systems.
5. NoSQL DBMS (Non-Relational DBMS)
• Purpose: Designed for handling unstructured and semi-structured data. It is optimized for high performance, scalability, and flexibility, making it ideal for Big Data and real-time web applications.
• Applications: Widely used in social media platforms, IoT applications, e-commerce sites, and distributed systems.
• Categories:
– Key-Value Stores: Data is stored as key-value pairs, enabling fast lookups. Examples: Redis, DynamoDB.
– Document Stores: Stores data in JSON, BSON, or XML documents, allowing flexible schemas. Examples: MongoDB, CouchDB.
– Column-Family (Wide-Column) Stores: Data is grouped into column families rather than fixed rows, which suits sparse data and large-scale, write-heavy workloads. Examples: Apache Cassandra, HBase.
– Graph Databases: Focuses on relationships between data points, using nodes and edges to represent and query data efficiently. Examples: Neo4j, ArangoDB.
• Advantages: High scalability, flexible data models, optimized for distributed environments.
• Disadvantages: Often weaker consistency guarantees than an RDBMS (many systems use an eventual consistency model), and no standardized query language.

7 DBMS Architecture

DBMS architecture defines the structure of a database system, focusing on how the data is stored, processed, and accessed. It determines how clients, servers, and databases interact with each other.
1. Single-Tier Architecture
• Description: In this architecture, the database and the application reside on the same machine. Users interact directly with the database without any intermediary.
• Usage: Suitable for small-scale applications where the data load is minimal, and a single user or a few users access the system.
• Advantages:
– Simple to design and implement.
– Fast data access as everything runs on the same system.
• Disadvantages:
– Limited scalability.
– Poor security because the database is directly accessible.
• Example: Microsoft Access, where both the database and the application run on a single computer.
2. Two-Tier Architecture
• Description: This model consists of two layers: the client and the server. The client (user interface) communicates directly with the database server using APIs or query languages like SQL.
• Components:
– Client Tier: Users interact with the database using applications or interfaces.
– Database Server Tier: Handles data storage, queries, and management.
• Advantages:
– Improved performance compared to single-tier systems.
– Better security as the database is on a separate server.
• Disadvantages:
– Scalability issues when the number of clients increases.
– High dependency between client and server.
• Examples: MySQL, PostgreSQL, and Oracle setups with direct client-server communication.
3. Three-Tier Architecture (Most Common in Enterprise Systems)
• Description: In this architecture, there are three distinct layers: the client layer, application layer, and database layer. This separation provides better security, performance, and scalability.
• Components:
– Client Layer (User Interface): The front-end application that interacts with the user. It sends requests to the application server.
– Application Layer (Business Logic): Processes the client requests, performs necessary computations, and interacts with the database. This layer ensures that business rules are enforced.
– Database Layer (Data Storage & Management): Manages data storage, retrieval, and transactions. It responds to requests from the application layer.
• Advantages:
– Enhanced security as the client cannot directly access the database.
– High scalability, suitable for large enterprise systems.
– Easier maintenance and updates as changes can be made in the middle tier without affecting the client or database.
• Disadvantages:
– Increased complexity in development and maintenance.
– Requires more resources compared to single-tier or two-tier architectures.
• Examples: Enterprise applications like banking systems, e-commerce platforms, and ERP systems.
Advantages of DBMS:
DBMS offers several benefits that enhance data management and system efficiency:
• Data Consistency: Ensures data remains accurate and consistent across the database. This is achieved through integrity constraints, normalization, and transaction management, preventing anomalies and redundant data.
• Security & Authorization: Provides role-based access control, allowing administrators to define who can access, modify, or delete data. It includes encryption, authentication, and authorization mechanisms to protect sensitive information.
• Efficient Query Processing: Optimizes data retrieval using indexing, query optimization techniques, and advanced search algorithms. This reduces the time required to fetch large datasets and improves application performance.
• Backup & Recovery: Offers automated backup and robust recovery features to protect against data loss due to hardware failures, software crashes, or natural disasters. This ensures business continuity and data availability.
• Concurrency Control: Supports multiple users accessing the database simultaneously without conflicts. This is managed through locking mechanisms, isolation levels, and transaction controls to maintain data integrity in multi-user environments.
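
The role-based access control described under Security & Authorization can be illustrated with a minimal sketch using the role statements available in MySQL 8.0. The role names, the user account, the placeholder password, and the shms database name are all hypothetical.

```sql
-- Role names, user name, and the shms database are hypothetical.
CREATE ROLE 'shms_reader', 'shms_editor';

-- Readers may only query; editors may also add and change rows.
GRANT SELECT ON shms.* TO 'shms_reader';
GRANT SELECT, INSERT, UPDATE ON shms.* TO 'shms_editor';

-- Create an account and give it the read-only role.
CREATE USER 'receptionist'@'localhost' IDENTIFIED BY 'change_me';
GRANT 'shms_reader' TO 'receptionist'@'localhost';

-- Activate the granted role automatically when the user connects.
SET DEFAULT ROLE 'shms_reader' TO 'receptionist'@'localhost';
```

Granting privileges to roles rather than to individual accounts means a change in policy is made once on the role and applies to every user who holds it.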
Disadvantages of DBMS
Despite its many advantages, DBMS has some limitations that organizations should consider:
• High Cost: Implementing a DBMS can be expensive due to licensing fees, hardware requirements, and ongoing maintenance costs. Large-scale systems also require investments in data storage, servers, and network infrastructure.
• Complexity: DBMS requires skilled database administrators (DBAs) to manage, configure, and optimize the system. This includes tasks like performance tuning, security management, and data migration, which can be complex and resource-intensive.
• Performance Overhead: DBMS can introduce performance overhead compared to simple file-based systems, especially when handling small datasets. This is due to additional layers of abstraction, security checks, and transaction processing, which can slow down operations in lightweight applications.
DBMS vs. File System:
The following table highlights the key differences between DBMS and File System based on critical features:

Feature | DBMS | File System
Data Storage | Stores structured data in tables with rows & columns | Stores data in flat files without predefined formats
Security | Provides user authentication & authorization mechanisms | No built-in database-level security; relies on operating-system file permissions
Redundancy | Reduces redundancy via normalization techniques | High redundancy as data may be duplicated across files
Concurrency | Supports multiple users simultaneously through transaction management | Limited concurrency support with risk of conflicts
Querying | Uses SQL for efficient querying & data manipulation | Requires manual searching without advanced query tools

Popular DBMS Software:
DBMS software is categorized into Relational DBMS (RDBMS) and NoSQL DBMS based on data models:
Relational DBMS (RDBMS):
MySQL:
• Open-source and widely used for web applications.
• Known for reliability, performance, and ease of use.
• Popular with PHP-based applications like WordPress.
PostgreSQL:
• Advanced, open-source, and ACID-compliant.
• Supports complex queries, foreign keys, triggers, and full-text search.
• Preferred for analytical and geospatial applications.
Oracle DB:
• Enterprise-level DBMS with high scalability and robust performance.
• Supports complex transactions, distributed databases, and large-scale applications.
• Widely used in finance, banking, and corporate sectors.
Microsoft SQL Server:
• Microsoft’s RDBMS for enterprise applications.
• Integrated with Microsoft products, providing advanced data analytics and reporting tools.
• Suitable for business intelligence and data warehousing.
NoSQL DBMS:
MongoDB:
• Document-based DBMS storing data in JSON-like BSON format.
• Highly flexible and scalable, ideal for big data applications.
• Used in content management systems, real-time analytics, and IoT applications.
Redis:
• In-memory key-value store used for caching, session management, and real-time analytics.
• Extremely fast due to in-memory data processing.
• Supports data structures like strings, hashes, lists, sets, and sorted sets.
Cassandra:
• Distributed, column-oriented NoSQL database.
• Designed for high availability and scalability in large-scale applications.
• Commonly used by tech giants for handling massive amounts of data.
Neo4j:
• Graph-based DBMS optimized for managing complex relationships.
• Uses graph structures with nodes, edges, and properties.
• Ideal for social networks, recommendation engines, and fraud detection systems.

8

h
Relational Database Management System (RDBMS)

et
ar
8.1 Introduction

A Relational Database Management System (RDBMS) is a type of database management
system that stores data in structured tables (relations). It follows the relational model,
organizing data into rows and columns and enforcing relationships between them using keys.
Examples of RDBMS Software:
• MySQL
• PostgreSQL
• Oracle Database
• Microsoft SQL Server
• SQLite

8.2 Key Features of RDBMS



• Data stored in tables – Organized into rows (records) and columns (attributes).
• Structured Query Language (SQL) – Used for querying and managing the database.
• Relationships – Tables can be linked using Primary Keys (PK) and Foreign Keys
(FK).
• ACID Compliance – Ensures reliable transactions with Atomicity, Consistency,
Isolation, and Durability.


• Data Integrity – Enforces constraints like Primary Key, Foreign Key, Unique, Check.
• Scalability – Can handle large amounts of structured data efficiently.


8.3 Components of RDBMS

• Tables (Relations) – Collections of records organized into rows and columns.


• Schema – Defines the database structure (tables, columns, data types).
• Indexes – Speed up data retrieval.
• Views – Virtual tables created from queries.
• Stored Procedures – Predefined SQL functions for complex operations.

• Triggers – Automated execution of SQL when certain conditions are met.
• Transactions – Ensures data consistency using ACID properties.

8.4 Concepts in RDBMS

Schema:
A Schema is the structure of a database that defines how data is organized. It includes
definitions of tables, fields, relationships, constraints, indexes, views, and other elements.
• A database schema is like a blueprint for organizing data.
• It does not store data but defines its organization.
Example:

CREATE SCHEMA University;


CREATE TABLE University.Students (
student_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100) UNIQUE
);

Entity:
An Entity is any real-world object that has attributes and can be represented in a database.
• Entities have attributes that describe their properties.

• Entities can be classified as strong entities and weak entities.


Example:
A Student is an entity with attributes student_id, name, and email.
Relation:

A Relation in RDBMS is a table that represents an entity.


• Each row (tuple) in a table represents an instance of an entity.
• Each column (attribute) represents a characteristic of the entity.
Example:


CREATE TABLE Students (


student_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100) UNIQUE
);

Relation Schema:
A Relation Schema is the structure of a relation (table), which includes:
• The name of the relation (table name)

• The attributes (columns) in the relation

• The data types of attributes
Example:
For the Students relation, the schema is:

Students(student_id: INT, name: VARCHAR(100), email: VARCHAR(100))
Weak Entity:
A Weak Entity is an entity that cannot be uniquely identified by its own attributes alone
and relies on a Strong Entity through a Foreign Key.
• A weak entity has a partial key.
• It must have a relationship with a strong entity.
Example:
A Dependent entity in an Employee-Dependent relationship:
CREATE TABLE Employee (
emp_id INT PRIMARY KEY,
name VARCHAR(100)
);

CREATE TABLE Dependent (


dep_id INT,
emp_id INT,
dep_name VARCHAR(100),
PRIMARY KEY (dep_id, emp_id),

FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)


);
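
The dependency of a weak entity on its owner can be checked with a short script. This is a minimal sketch that assumes Python's built-in sqlite3 module in place of a server DBMS, purely so that it runs anywhere; the sample rows ('Asha', 'Ravi', 'Orphan') are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE Dependent (
    dep_id INT,
    emp_id INT,
    dep_name VARCHAR(100),
    PRIMARY KEY (dep_id, emp_id),
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
);
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Asha')")   # hypothetical owner row
conn.execute("INSERT INTO Dependent VALUES (1, 1, 'Ravi')")

# A dependent whose owning employee does not exist is rejected.
try:
    conn.execute("INSERT INTO Dependent VALUES (1, 99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

dependent_count = conn.execute("SELECT COUNT(*) FROM Dependent").fetchone()[0]
print(orphan_rejected, dependent_count)
```

The partial key dep_id is unique only within one employee, which is why the primary key combines it with emp_id.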

9 Rules of Relational Databases

Codd’s Rules for Relational Database Management Systems (RDBMS):
Codd’s 12 rules, proposed by Dr. Edgar F. Codd, define the requirements for a database
management system to be considered truly relational.
These rules serve as a benchmark to evaluate the functionality of relational database systems.
Rule 1: The Information Rule:
Description: All information in a database is represented explicitly using values in tables.
Example: Customer data like name, address, and phone number is stored in rows and columns
within tables, not in proprietary formats.
Use Case: Storing Customer Data in a Relational Format
Scenario: An e-commerce platform needs to store customer details such as name, email, and
address.
Implementation in MySQL:

CREATE TABLE Customer (


Customer_ID INT PRIMARY KEY,
Name VARCHAR(100),
Email VARCHAR(100),

Address TEXT
);

INSERT INTO Customer (Customer_ID, Name, Email, Address)


VALUES (1, 'Alice', '[email protected]', '123 Elm Street');

Outcome: All data is stored explicitly in tabular form, ensuring adherence to this rule.

Rule 2: Guaranteed Access Rule:


Description: Every piece of data must be accessible without ambiguity, using a combination
of table name, primary key, and column name.
Example: To retrieve a customer’s name, you use SELECT Name FROM Customer WHERE
Customer_ID = 101.
Use Case: Retrieving Specific Customer Information
Scenario: Retrieve the email address of a customer with a specific ID.


Implementation in PostgreSQL:

SELECT Email
FROM Customer
WHERE Customer_ID = 1;

Outcome: Data is uniquely accessible using the table name (Customer), primary key
(Customer_ID), and column name (Email).
Rule 3: Systematic Treatment of NULL Values:

Description: NULL values (representing missing or inapplicable information) must be
systematically handled.
Example: A NULL value in the Phone_Number column means the customer’s phone number
is unknown, but it must not cause unexpected errors in queries.
Use Case: Handling Missing Data
Scenario: A customer doesn’t provide a phone number during registration.

Implementation in Oracle:

CREATE TABLE Customer (


Customer_ID INT PRIMARY KEY,
Name VARCHAR(100),
Phone_Number VARCHAR(15)
);

INSERT INTO Customer (Customer_ID, Name, Phone_Number)


VALUES (1, 'Bob', NULL);
Outcome: The Phone_Number column contains NULL, representing missing data
systematically.
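
What systematic handling means in practice can be seen in how comparisons and aggregates treat NULL. The sketch below assumes SQLite through Python's sqlite3 module rather than Oracle; the second customer ('Carol') and her phone number are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    "Customer_ID INT PRIMARY KEY, Name VARCHAR(100), Phone_Number VARCHAR(15))"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, "Bob", None), (2, "Carol", "555-0100")],  # None is stored as NULL
)

# '= NULL' never evaluates to true, so it matches no rows.
eq_null = conn.execute(
    "SELECT Name FROM Customer WHERE Phone_Number = NULL").fetchall()

# IS NULL is the predicate designed for missing values.
is_null = conn.execute(
    "SELECT Name FROM Customer WHERE Phone_Number IS NULL").fetchall()

# COUNT(*) counts rows; COUNT(column) skips NULLs in that column.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(Phone_Number) FROM Customer").fetchone()

print(eq_null, is_null, counts)
```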
Rule 4: Dynamic Online Catalog Based on the Relational Model:
Description: The database’s structure (metadata) must also be stored as tables, allowing
users to query it using SQL.


Example: The INFORMATION_SCHEMA in MySQL lets users query database structure like
tables and columns.
Use Case: Querying Metadata

Scenario: Retrieve all table names in the database.


Implementation in MySQL:

SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'my_database';

Outcome: Metadata is accessible through SQL queries, ensuring compliance with this rule.
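
The same idea can be tried without a MySQL server. This sketch assumes SQLite, which has no INFORMATION_SCHEMA; its catalog is the sqlite_master table, and column details come from PRAGMA table_info. The two tables are cut-down versions of the ones used in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Customer_ID INT PRIMARY KEY, Name VARCHAR(100))")
conn.execute("CREATE TABLE Orders (Order_ID INT PRIMARY KEY, Customer_ID INT)")

# The catalog is itself a table, so an ordinary SELECT works on it.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
customer_columns = [row[1] for row in conn.execute("PRAGMA table_info(Customer)")]

print(table_names, customer_columns)
```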
Rule 5: Comprehensive Data Sub-Language Rule:
Description: The system must support a single, comprehensive language (like SQL) for all
operations, including querying, updating, and defining data.
Example: SQL is used for creating tables (CREATE), querying data (SELECT), and modifying
data (UPDATE).
Use Case: Managing Data with a Single Language


Scenario: Create a table, insert data, and query it using SQL.


Implementation in PostgreSQL:

CREATE TABLE Orders (


Order_ID INT PRIMARY KEY,
Customer_ID INT,
Total_Amount DECIMAL(10, 2)
);

INSERT INTO Orders (Order_ID, Customer_ID, Total_Amount)

VALUES (101, 1, 250.75);

SELECT * FROM Orders WHERE Customer_ID = 1;

Outcome: All database operations (creation, manipulation, querying) are performed using
SQL.
Rule 6: View Updatability Rule:

Description: Any view (virtual table) derived from base tables should be updatable if it is
theoretically possible.
Example: A view showing active customers (SELECT * FROM Customer WHERE Status =
'Active') must allow updates to the underlying Customer table.
Use Case: Updating a View
Scenario: Update customer email via a view that filters active customers.
Implementation in PostgreSQL:

CREATE VIEW ActiveCustomers AS


SELECT Customer_ID, Name, Email
FROM Customer
WHERE Status = 'Active';

UPDATE ActiveCustomers

SET Email = '[email protected]'


WHERE Customer_ID = 1;

Outcome: The update is reflected in the base table, maintaining compliance with this rule.
Rule 7: High-Level Insert, Update, and Delete:

Description: The system must support high-level operations on sets of rows, not just one row
at a time.
Example: A single UPDATE statement can change every row that matches its WHERE clause.
Use Case: Bulk Updating Employee Salaries

Scenario: Increase the salaries of all employees in the “IT” department by 10%.
Implementation in Oracle:

UPDATE Employee
SET Salary = Salary * 1.1
WHERE Department = 'IT';

Outcome: The operation updates multiple rows simultaneously.
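
The set-at-a-time behaviour is easy to confirm by counting the affected rows. This is a sketch under stated assumptions: SQLite stands in for Oracle, the Emp_ID column and the three sample rows are hypothetical, and the 10% raise is written with integer arithmetic so the results are exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Emp_ID INT PRIMARY KEY, Department VARCHAR(50), Salary INT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "IT", 50000), (2, "IT", 60000), (3, "HR", 55000)],
)

# One statement, no row-by-row loop in the application.
cur = conn.execute(
    "UPDATE Employee SET Salary = Salary + Salary / 10 WHERE Department = 'IT'"
)
rows_changed = cur.rowcount  # number of rows the statement modified

salaries = conn.execute(
    "SELECT Emp_ID, Salary FROM Employee ORDER BY Emp_ID").fetchall()
print(rows_changed, salaries)
```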


Rule 8: Physical Data Independence:


Description: Changes to the physical storage of data must not affect how data is accessed at
the logical level.
Example: Moving the Customer table to a different disk or partition does not affect SQL
queries accessing it.
Use Case: Moving Data to a Different Storage System
Scenario: Migrate the Orders table to a different disk without affecting queries.
Implementation in MySQL:

The table is moved physically, but queries like SELECT * FROM Orders; continue to work
seamlessly.
Outcome: Changes in physical storage do not impact logical queries.

Rule 9: Logical Data Independence:
Description: Changes to the logical structure (schema) of a database must not affect
applications accessing the data.

Example: Adding a new column Middle_Name to the Customer table should not break existing
queries.
Use Case: Adding a New Column Without Breaking Existing Queries
Scenario: Add a “LoyaltyPoints” column to the Customer table without affecting existing
queries.
Implementation in PostgreSQL:

ALTER TABLE Customer ADD LoyaltyPoints INT DEFAULT 0;


Outcome: Existing queries like SELECT Name, Email FROM Customer; remain unaffected.
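
A quick check of this outcome, assuming SQLite in place of PostgreSQL and a hypothetical sample row: the same query is run before and after the schema change and the results are compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    "Customer_ID INT PRIMARY KEY, Name VARCHAR(100), Email VARCHAR(100))"
)
conn.execute("INSERT INTO Customer VALUES (1, 'Alice', 'alice@example.com')")

query = "SELECT Name, Email FROM Customer"  # names its columns explicitly
before = conn.execute(query).fetchall()

conn.execute("ALTER TABLE Customer ADD COLUMN LoyaltyPoints INT DEFAULT 0")

after = conn.execute(query).fetchall()
points = conn.execute(
    "SELECT LoyaltyPoints FROM Customer WHERE Customer_ID = 1").fetchone()[0]
print(before == after, points)
```

Queries that name their columns are insulated from the change. SELECT * and INSERT statements without a column list are not, since both depend on the full column set.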
Rule 10: Integrity Independence:
Description: Integrity constraints (like primary keys, foreign keys, and check constraints)
must be defined and stored in the database, not enforced by application logic.

Example: A FOREIGN KEY ensures that a value in the Orders table’s Customer_ID column
exists in the Customer table.
Use Case: Enforcing a Foreign Key Constraint
Scenario: Ensure that every order references a valid customer.

Implementation in Oracle:

ALTER TABLE Orders


ADD CONSTRAINT FK_Customer
FOREIGN KEY (Customer_ID)
REFERENCES Customer(Customer_ID);

Outcome: The database enforces integrity, ensuring valid references.


Rule 11: Distribution Independence:
Description: The database must function correctly regardless of whether it is distributed
across multiple locations or centralized.
Example: A query retrieving data from a distributed database behaves the same way as if it
were accessing a single database.
Use Case: Querying a Distributed Database


Scenario: Access data from a replicated database setup.


Implementation in PostgreSQL:
Using logical replication, data is distributed across nodes, but queries like SELECT * FROM
Orders; return consistent results regardless of distribution.
Outcome: The system handles distribution transparently.
Rule 12: Non-Subversion Rule:
Description: If the system provides a low-level access method, it must not bypass the integrity
and security constraints of the database.
Example: Even if accessing a database through an API, constraints like primary keys must
still be enforced.

Use Case: Enforcing Constraints Even in Direct API Access
Scenario: A developer inserts data via an API that directly interacts with the database.
Implementation in MySQL:

Primary key and foreign key constraints ensure that invalid data cannot bypass rules, even
when using APIs.
Outcome: The system enforces integrity at all access levels.

Rule                      MySQL             PostgreSQL        Oracle
Information Rule          Fully supported   Fully supported   Fully supported
Guaranteed Access Rule    Fully supported   Fully supported   Fully supported
NULL Handling             Basic             Advanced          Advanced
Online Catalog            Comprehensive     Extensive         Extensive
Comprehensive Language    SQL               SQL + PL/pgSQL    SQL + PL/SQL
View Updatability         Limited           Flexible          Flexible
High-Level Operations     Standard          Standard          Advanced (MERGE)
Physical Independence     Fully supported   Fully supported   Fully supported
Logical Independence      Basic             Advanced          Extensive
Integrity Enforcement     Basic             Advanced          Advanced
Distribution              Standard          Advanced          Highly Advanced
Non-Subversion            Fully supported   Fully supported   Fully supported

Practical Comparison Across Systems:

Rule                      MySQL             PostgreSQL        Oracle
View Updatability         Limited           Flexible          Advanced
NULL Handling             Basic             Advanced          Advanced
Metadata Access           Standard          Extensive         Extensive
Logical Independence      Standard          Advanced          Advanced
Integrity Enforcement     Limited           Advanced          Advanced
Distribution              Basic             Advanced          Oracle RAC

Significance of Codd’s Rules:


Standardization: They provide a clear framework to evaluate and compare relational
databases.


Data Integrity: Emphasize strong data integrity and consistency through systematic
constraints.
Flexibility: Ensure logical and physical independence, allowing changes without disrupting
users or applications.
Usability: Advocate for comprehensive language support and accessible data structures,
making relational databases user-friendly.
Practical Relevance:
While no commercial database strictly adheres to all 12 rules, relational databases like
MySQL, PostgreSQL, and Oracle Database implement most of them, ensuring robust and
reliable data management.

9.1 Relationships in RDBMS

Cardinality:
Cardinality defines the number of instances in one table that can be associated with
instances in another table.
Types of Cardinality:
1. One-to-One (1:1) – Each record in Table A is linked to one record in Table B.
2. One-to-Many (1:M) – Each record in Table A can be linked to multiple records in Table
B.
3. Many-to-Many (M:N) – Each record in Table A can be linked to multiple records in
Table B and vice versa.
Examples:
1. One-to-One:

CREATE TABLE Passport (


passport_id INT PRIMARY KEY,
person_id INT UNIQUE,
FOREIGN KEY (person_id) REFERENCES Person(person_id)
);

2. One-to-Many:

CREATE TABLE Orders (


order_id INT PRIMARY KEY,
customer_id INT,
FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);

3. Many-to-Many:

CREATE TABLE Student_Course (


student_id INT,
course_id INT,
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (student_id) REFERENCES Students(student_id),
FOREIGN KEY (course_id) REFERENCES Courses(course_id)
);
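
To show how a junction table is used once it exists, the following sketch resolves the many-to-many link with two joins. It assumes SQLite through Python's sqlite3 module; the name and course_name columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Students (student_id INT PRIMARY KEY, name VARCHAR(100));
CREATE TABLE Courses (course_id INT PRIMARY KEY, course_name VARCHAR(100));
CREATE TABLE Student_Course (
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES Students(student_id),
    FOREIGN KEY (course_id) REFERENCES Courses(course_id)
);
INSERT INTO Students VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Courses VALUES (10, 'Databases'), (20, 'Networks');
-- Alice takes both courses; 'Databases' has two students.
INSERT INTO Student_Course VALUES (1, 10), (1, 20), (2, 10);
""")

# Each junction row pairs one student with one course.
rows = conn.execute("""
    SELECT s.name, c.course_name
    FROM Student_Course sc
    JOIN Students s ON s.student_id = sc.student_id
    JOIN Courses c ON c.course_id = sc.course_id
    ORDER BY s.name, c.course_name
""").fetchall()
print(rows)
```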


Types of Relationships:
• One-to-One (1:1) – Each record in Table A has exactly one related record in Table B.
• One-to-Many (1:M) – A record in Table A can have multiple related records in Table B.
• Many-to-Many (M:N) – Multiple records in Table A relate to multiple records in Table
B. This requires a junction table.

9.2 Keys in RDBMS

Keys in Relational Database Management Systems (RDBMS) play a vital role in ensuring
data integrity, uniqueness, and relationships between tables.
Types of Keys in RDBMS:
• Primary Key (PK) – Uniquely identifies a record in a table.

• Foreign Key (FK) – Establishes relationships between tables.
• Unique Key – Ensures column values are unique (but allows NULL).
• Composite Key – A combination of multiple columns as a primary key.
• Candidate Key – A minimal set of attributes that can uniquely identify a row.
• Super Key – A Candidate Key or any superset of one; it still uniquely identifies a row.
Primary Key (PK):
A Primary Key is a column (or a set of columns) that uniquely identifies each record in a
table.
• Must be unique.
• Cannot contain NULL values.
• A table can have only one primary key.

Example:
CREATE TABLE Students (
student_id INT PRIMARY KEY, -- Unique and NOT NULL
name VARCHAR(100),

email VARCHAR(100) UNIQUE


);

Foreign Key (FK):


A Foreign Key is a column (or set of columns) that establishes a relationship between tables.
• Refers to a Primary Key in another table.

• Ensures referential integrity.


• Helps maintain data consistency.
Example:
CREATE TABLE Courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(100)
);

CREATE TABLE Enrollments (


enrollment_id INT PRIMARY KEY,
student_id INT,
course_id INT,
FOREIGN KEY (student_id) REFERENCES Students(student_id),
FOREIGN KEY (course_id) REFERENCES Courses(course_id)
);

Unique Key:
A Unique Key constraint ensures that all values in a column are unique but allows NULL
values.
• A table can have multiple unique keys.
• Unlike the Primary Key, it permits NULL values.

Example:

CREATE TABLE Employees (


emp_id INT PRIMARY KEY,
email VARCHAR(100) UNIQUE,
phone_number VARCHAR(15) UNIQUE -- NULL is allowed; non-NULL values must be unique
);

Composite Key:
A Composite Key is a combination of multiple columns that uniquely identifies a row in a
table.
• Used when a single column is not sufficient to uniquely identify records.
• The combination of columns must be unique.
Example:

CREATE TABLE Orders (


order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id) -- Composite Key

);

Candidate Key:
A Candidate Key is a minimal set of attributes that can uniquely identify a row in a table. A
table can have multiple candidate keys, but only one is chosen as the Primary Key.
• Minimal means that removing any attribute from the key would make it non-unique.

• One of the candidate keys is chosen as the Primary Key.


Example:

CREATE TABLE Library (


book_id INT,
isbn VARCHAR(20),
title VARCHAR(255),
author VARCHAR(255),


PRIMARY KEY (book_id), -- Chosen as Primary Key
UNIQUE (isbn) -- Another Candidate Key
);

Super Key:
A Super Key is a superset of a Candidate Key. It includes additional attributes that are not
necessary for uniqueness but still uniquely identify a row.
• A Candidate Key is the minimal version of a Super Key.

• Super Keys may contain extra attributes that do not contribute to uniqueness.
Example:

CREATE TABLE Customers (
customer_id INT,
email VARCHAR(100),
phone VARCHAR(20),

address VARCHAR(255),
PRIMARY KEY (customer_id),
UNIQUE (email)
);

Here:
• {customer_id} is a Candidate Key.
• {customer_id, email, phone} is a Super Key (it contains extra attributes but still
uniquely identifies a row).
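
The distinction between super keys and candidate keys can be made concrete with a small check over sample rows. This is a hedged sketch in Python: the helper names is_super_key and is_candidate_key and the sample data are hypothetical, and a sample can only refute a key, never prove one, because key status is a property of the schema rather than of the rows that happen to be stored.

```python
from itertools import combinations

# Hypothetical sample of the Customers table.
rows = [
    {"customer_id": 1, "email": "a@example.com", "phone": "111"},
    {"customer_id": 2, "email": "b@example.com", "phone": "222"},
    {"customer_id": 3, "email": "c@example.com", "phone": "111"},
]

def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(rows)

def is_candidate_key(attrs, rows):
    """A super key is a candidate key if no proper subset is a super key."""
    if not is_super_key(attrs, rows):
        return False
    return not any(
        is_super_key(subset, rows)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_candidate_key(("customer_id",), rows))               # minimal and unique
print(is_super_key(("customer_id", "email", "phone"), rows))  # unique, not minimal
print(is_super_key(("phone",), rows))                         # '111' repeats
```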
Comparison of Keys:

Table 1: Key Differences in RDBMS

Key Type        Uniqueness              Allows NULLs?                     Number Per Table                            Use Case
Primary Key     Yes                     No                                One per table                               Uniquely identifies each record
Foreign Key     No                      Yes                               Multiple                                    Establishes relationships between tables
Unique Key      Yes                     Yes (one or more, DBMS-dependent) Multiple                                    Ensures column values are unique but allows NULLs
Composite Key   Yes (as a combination)  No                                One per table (can include multiple columns) Identifies records when a single column is insufficient
Candidate Key   Yes                     No                                Multiple                                    Potential primary key options
Super Key       Yes                     No                                Multiple                                    Includes candidate key plus extra attributes

Importance of Keys in RDBMS:


• Data Integrity – Prevents duplicate or inconsistent data.


• Uniqueness – Ensures proper identification of records.


• Relationships – Establishes meaningful links between tables.
• Performance Optimization – Indexing on keys speeds up queries.
Example:

CREATE TABLE Students (


student_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100) UNIQUE

);

CREATE TABLE Enrollments (

enrollment_id INT PRIMARY KEY,
student_id INT,
course_id INT,
FOREIGN KEY (student_id) REFERENCES Students(student_id)
);

CREATE TABLE Orders (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id) -- Composite Key
);

10 Constraints in RDBMS

Constraints in a Relational Database Management System (RDBMS) are rules applied
to table columns to ensure data integrity and accuracy. They restrict the type of data that
can be stored and maintain the consistency of the database.
Types of Constraints in RDBMS
1. PRIMARY KEY Constraint
• A PRIMARY KEY uniquely identifies each record in a table.
• It must contain unique values and cannot be NULL.
• A table can have only one PRIMARY KEY, which may consist of a single column or
multiple columns (Composite Key).

CREATE TABLE Employees (


emp_id INT PRIMARY KEY,
name VARCHAR(100),
age INT

);

2. FOREIGN KEY Constraint


• A FOREIGN KEY is used to maintain a relationship between two tables.

• It references the PRIMARY KEY in another table.


• Ensures referential integrity, meaning values in the foreign key column
must exist in the referenced table.

CREATE TABLE Orders (


order_id INT PRIMARY KEY,

emp_id INT,
FOREIGN KEY (emp_id) REFERENCES Employees(emp_id)
);

3. UNIQUE Constraint
• Ensures that all values in a column are unique.
• Unlike PRIMARY KEY, a table can have multiple UNIQUE constraints.
• NULL values are allowed unless specified otherwise.


CREATE TABLE Users (


user_id INT PRIMARY KEY,
email VARCHAR(100) UNIQUE
);

4. NOT NULL Constraint


• Ensures that a column cannot store NULL values.
• Used when a field must always have a value (e.g., Employee Name, Date
of Birth).

CREATE TABLE Products (


product_id INT PRIMARY KEY,

product_name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL
);

5. CHECK Constraint
• Defines a condition that must be met before inserting or updating data.
• Helps enforce business rules (e.g., age must be at least 18).

CREATE TABLE Students (
student_id INT PRIMARY KEY,
age INT CHECK (age >= 18)
);

6. DEFAULT Constraint
• Assigns a default value if no value is provided during insertion.

CREATE TABLE Accounts (


account_id INT PRIMARY KEY,
balance DECIMAL(10,2) DEFAULT 0.00
);

7. AUTO_INCREMENT Constraint (MySQL) / SERIAL (PostgreSQL)


• Automatically generates unique sequential values for a column (commonly
used for primary keys).

Example (MySQL):

CREATE TABLE Customers (


customer_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100)
);

Example (PostgreSQL):

CREATE TABLE Customers (


customer_id SERIAL PRIMARY KEY,
name VARCHAR(100)
);


Constraint                 Description
PRIMARY KEY                Uniquely identifies each row (must be unique and NOT NULL).
FOREIGN KEY                Ensures referential integrity between two tables.
UNIQUE                     Ensures all values in a column are unique.
NOT NULL                   Ensures a column cannot have NULL values.
CHECK                      Restricts values based on a condition.
DEFAULT                    Provides a default value if no value is specified.
AUTO_INCREMENT / SERIAL    Automatically generates unique numbers for a column.

Why Use Constraints?
✓ Ensures data integrity and accuracy
✓ Prevents invalid data entry

✓ Enforces business rules at the database level
✓ Reduces the need for manual validation in applications
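
The effect of these constraints can be observed by attempting inserts that violate them. This is a minimal sketch that assumes SQLite through Python's sqlite3 module; the combined Students table, the credits column, the helper rejected, and the sample values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Students (
    student_id INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(100) UNIQUE,
    age INT CHECK (age >= 18),
    credits INT DEFAULT 0
)""")
conn.execute(
    "INSERT INTO Students (student_id, name, email, age) "
    "VALUES (1, 'Alice', 'alice@example.com', 20)"
)

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "duplicate primary key": rejected(
        "INSERT INTO Students (student_id, name, age) VALUES (1, 'Bob', 21)"),
    "missing name (NOT NULL)": rejected(
        "INSERT INTO Students (student_id, age) VALUES (2, 21)"),
    "duplicate email (UNIQUE)": rejected(
        "INSERT INTO Students (student_id, name, email, age) "
        "VALUES (3, 'Carol', 'alice@example.com', 22)"),
    "age below 18 (CHECK)": rejected(
        "INSERT INTO Students (student_id, name, age) VALUES (4, 'Dev', 17)"),
}

# DEFAULT filled in the column that the first INSERT omitted.
default_credits = conn.execute(
    "SELECT credits FROM Students WHERE student_id = 1").fetchone()[0]
print(results, default_credits)
```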


11 Structured Query Language (SQL)

Structured Query Language (SQL) is used to interact with RDBMS.

11.1 Types of SQL Commands


• DDL (Data Definition Language): CREATE TABLE, ALTER TABLE, DROP TABLE
• DML (Data Manipulation Language): INSERT, UPDATE, DELETE
• DQL (Data Query Language): SELECT
• DCL (Data Control Language): GRANT, REVOKE
• TCL (Transaction Control Language): COMMIT, ROLLBACK
Definition:

Structured Query Language (SQL) is a standard language used to interact with Relational
Database Management Systems (RDBMS). It allows users to create, manipulate, and
retrieve data efficiently.
Key Features of SQL:

• Declarative Language – Users specify what they want, and the system determines how
to execute it.
• Standardized Language – Used across multiple RDBMS platforms like MySQL,
PostgreSQL, SQL Server, and Oracle.
• Powerful Query Capabilities – Supports filtering, aggregation, and joins for complex
data retrieval.

• Data Integrity & Security – Includes constraints, transactions, and access control
mechanisms.
• Scalability & Performance – Optimized for handling large datasets efficiently.
Types of SQL Commands:
SQL is categorized into five main types:
1. Data Definition Language (DDL) – Defines and modifies database structure.


• CREATE – Creates databases, tables, indexes, and views.


• ALTER – Modifies existing database structures.
• DROP – Deletes databases or tables.
• TRUNCATE – Removes all records from a table but keeps its structure.
Example:

CREATE TABLE Students (


student_id INT PRIMARY KEY,

name VARCHAR(100),
email VARCHAR(100) UNIQUE
);

2. Data Manipulation Language (DML) – Handles data operations.
• INSERT – Adds new records.
• UPDATE – Modifies existing records.

• DELETE – Removes records.
Example:

INSERT INTO Students (student_id, name, email)
VALUES (1, 'Alice', 'alice@example.com');

3. Data Query Language (DQL) – Retrieves data from databases.


• SELECT – Fetches records based on conditions.
Example:

SELECT * FROM Students WHERE student_id = 1;

4. Data Control Language (DCL) – Manages user permissions.



• GRANT – Assigns privileges to users.


• REVOKE – Removes privileges.
Example:

GRANT SELECT ON Students TO user1;



5. Transaction Control Language (TCL) – Manages transactions.


• COMMIT – Saves changes permanently.
• ROLLBACK – Undoes changes if errors occur.

• SAVEPOINT – Creates points for partial rollbacks.


Example:

BEGIN TRANSACTION;
UPDATE Students SET email = '[email protected]' WHERE student_id = 1;
ROLLBACK;

SQL and RDBMS Interaction:


SQL allows users to interact with an RDBMS through the following actions:


1. Creating Databases & Tables – Defining database schemas and structures.


2. Inserting & Managing Data – Populating tables with records.
3. Retrieving Information – Querying data using SELECT.
4. Updating & Deleting Data – Modifying or removing specific records.
5. Establishing Relationships – Using FOREIGN KEYS to maintain integrity.
6. Ensuring Data Integrity – Applying constraints like PRIMARY KEY, UNIQUE, and NOT
NULL.

Importance of SQL in RDBMS:
• Data Management – Efficiently handles structured data.

• Data Integrity & Security – Prevents unauthorized access and maintains consistency.
• Scalability – Supports large-scale applications and complex queries.
• Standardization – Universally accepted across different database systems.

11.2 SQL Basics
Introduction to SQL: creating tables, basic queries, and DML operations (INSERT, UPDATE,
DELETE).
1. Creating Tables in SQL:
Tables are the fundamental building blocks of a relational database, where data is stored in
rows and columns.
Syntax for Creating a Table

CREATE TABLE table_name (


column1 datatype [constraints],

column2 datatype [constraints],


column3 datatype [constraints],
...
);

Explanation

• table_name: Name of the table to create.


• column1, column2, ...: Names of the columns in the table.
• datatype: Data type for each column (e.g., INT, VARCHAR, DATE).
• constraints: Rules applied to the columns (e.g., NOT NULL, PRIMARY KEY, UNIQUE).

Example

CREATE TABLE Employees (


EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
DateOfBirth DATE,
Salary DECIMAL(10, 2)
);


This creates an Employees table with:
• A unique EmployeeID as the primary key.
• First and last names (both required).
• Date of birth and salary with specific data types.
2. Basic SQL Queries:
SQL queries are used to retrieve data from tables.
SELECT Statement
Used to retrieve specific columns or all columns from a table.
Syntax

SELECT column1, column2, ...
FROM table_name

[WHERE condition]
[ORDER BY column [ASC|DESC]];

Example

-- Retrieve all columns
SELECT * FROM Employees;

-- Retrieve specific columns


SELECT FirstName, LastName, Salary FROM Employees;

-- Retrieve records with conditions


SELECT * FROM Employees
WHERE Salary > 50000;
-- Retrieve records ordered by salary (descending)
SELECT FirstName, Salary
FROM Employees
ORDER BY Salary DESC;

3. DML Operations:
Data Manipulation Language (DML) commands are used to modify data within tables. These
include INSERT, UPDATE, and DELETE.


A. INSERT Statement
Used to add new records to a table.

Syntax

INSERT INTO table_name (column1, column2, ...)


VALUES (value1, value2, ...);

Example

-- Insert a single row


INSERT INTO Employees (EmployeeID, FirstName, LastName, DateOfBirth, Salary)
VALUES (1, 'John', 'Doe', '1985-06-15', 75000.00);

-- Insert multiple rows


INSERT INTO Employees (EmployeeID, FirstName, LastName, DateOfBirth, Salary)
VALUES
(2, 'Jane', 'Smith', '1990-04-12', 68000.00),
(3, 'Emily', 'Jones', '1988-09-22', 72000.00);


B. UPDATE Statement
Used to modify existing records in a table.
Syntax

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

Example

-- Update salary for a specific employee

UPDATE Employees
SET Salary = 80000.00
WHERE EmployeeID = 1;

-- Update salaries for all employees with salary below a certain amount

UPDATE Employees
SET Salary = Salary + 5000.00
WHERE Salary < 70000.00;

C. DELETE Statement
Used to remove records from a table.
Syntax

DELETE FROM table_name


WHERE condition;

Example

-- Delete a specific employee



DELETE FROM Employees


WHERE EmployeeID = 2;

-- Delete all employees with a salary less than a certain amount

DELETE FROM Employees



WHERE Salary < 60000.00;

Key Points to Remember:


• Data Types: Always use appropriate data types for columns to optimize storage and
ensure data integrity.

• Constraints: Use constraints like PRIMARY KEY, NOT NULL, and FOREIGN KEY to
maintain data consistency.
• WHERE Clause: Always use WHERE with UPDATE and DELETE to avoid unintentional
changes to all rows.
• SELECT *: Avoid using SELECT * in production; explicitly list required columns for better
performance.


11.3 Altering Table Structure in MySQL

The ALTER TABLE statement in MySQL is used to modify the structure of an existing table.
You can use it to add, modify, or delete columns, as well as to add or remove constraints.
1. Add a New Column:
To add a new column to an existing table:
Syntax

ALTER TABLE table_name
ADD column_name datatype [constraints];

Example

-- Add a new column 'Email' to the 'Employees' table


ALTER TABLE Employees

ADD Email VARCHAR(100);

2. Modify an Existing Column:


To change the datatype or constraints of an existing column:
Syntax

ALTER TABLE table_name


MODIFY column_name new_datatype [new_constraints];

Example
-- Change the datatype of 'Salary' to FLOAT
ALTER TABLE Employees
MODIFY Salary FLOAT;

3. Rename a Column:

To rename an existing column:


Syntax

ALTER TABLE table_name



CHANGE old_column_name new_column_name datatype [constraints];

Example

-- Rename 'DateOfBirth' to 'DOB'


ALTER TABLE Employees
CHANGE DateOfBirth DOB DATE;

4. Drop a Column:
To delete a column from an existing table:
Syntax

ALTER TABLE table_name


DROP COLUMN column_name;

Example


-- Remove the 'Email' column from the 'Employees' table


ALTER TABLE Employees
DROP COLUMN Email;

5. Add Constraints:
To add constraints to an existing table:
Syntax

ALTER TABLE table_name

ADD CONSTRAINT constraint_name constraint_type (column_name);

Example

-- Add a UNIQUE constraint to the 'Email' column
ALTER TABLE Employees
ADD CONSTRAINT unique_email UNIQUE (Email);

-- Add a FOREIGN KEY constraint
ALTER TABLE Employees
ADD CONSTRAINT fk_department_id
FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID);

6. Drop Constraints:
To remove constraints from an existing table. The form below drops a foreign key; in MySQL,
a UNIQUE constraint is removed with DROP INDEX constraint_name instead.
Syntax

ALTER TABLE table_name
DROP FOREIGN KEY constraint_name;

Example

-- Drop the foreign key constraint
ALTER TABLE Employees
DROP FOREIGN KEY fk_department_id;

Key Considerations:
• Data Loss: Dropping a column deletes its data, and changing a column's datatype can
truncate or convert existing values.
• Dependent Objects: Ensure that indexes or constraints tied to the columns are considered
before modification.
• MySQL-Specific Syntax: The CHANGE keyword is unique to MySQL for renaming
columns. MySQL 8.0 also accepts the standard RENAME COLUMN old_name TO new_name form.

12 ACID Properties in RDBMS

• Atomicity – A transaction is all or nothing.
• Consistency – The database remains in a valid state.
• Isolation – Multiple transactions do not interfere with each other.
• Durability – Once committed, data is permanently recorded.

13 Transactions in RDBMS

A transaction in SQL is a sequence of operations performed as a single logical unit of work.
Transactions help maintain the integrity and consistency of a database, especially in
multi-user environments.
What is a Transaction?
Transaction control ensures data integrity.
A transaction is a group of SQL operations that are executed together, following the ACID
properties:
• Atomicity: All operations succeed or none do.
• Consistency: The database remains in a valid state before and after the transaction.
• Isolation: Transactions are isolated from each other.
• Durability: Once committed, changes are permanent even after a system failure.
Transaction Control Commands:
• BEGIN TRANSACTION / START TRANSACTION: Starts a new transaction (the exact keyword
depends on the DBMS; MySQL uses START TRANSACTION or BEGIN).
• COMMIT: Saves the changes made during the transaction.
• ROLLBACK: Reverts changes to the last committed state.
• SAVEPOINT: Creates a checkpoint within a transaction for partial rollbacks.
• RELEASE SAVEPOINT: Deletes a savepoint.
• ROLLBACK TO SAVEPOINT: Rolls back to a specific savepoint.

START TRANSACTION;

DELETE FROM orders WHERE order_id = 101;

ROLLBACK; -- Undo the delete operation
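
The same delete-then-rollback sequence can be tried without a MySQL server. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module and an in-memory SQLite database in place of MySQL, and the item column and sample rows are made up for the demonstration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN and ROLLBACK statements below are sent exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, item TEXT)")
conn.execute("INSERT INTO orders VALUES (101, 'Laptop'), (102, 'Phone')")

conn.execute("BEGIN")                                    # start the transaction
conn.execute("DELETE FROM orders WHERE order_id = 101")
rows_inside = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

conn.execute("ROLLBACK")                                 # undo the delete
rows_after = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(rows_inside, rows_after)
```

Inside the transaction the row is gone; after ROLLBACK the table holds both rows again, which is the atomicity property in miniature.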

SQL Transactions: Role of SAVEPOINT, COMMIT, and ROLLBACK


COMMIT:
Role:
• Permanently saves all changes made during the current transaction.
• After a COMMIT, changes cannot be rolled back.


Example:

START TRANSACTION;

UPDATE Accounts
SET Balance = Balance - 500
WHERE AccountID = 101;

UPDATE Accounts
SET Balance = Balance + 500
WHERE AccountID = 202;

COMMIT;

• Explanation: Money is transferred from Account 101 to 202.
• Impact: Both updates are saved permanently.
ROLLBACK:
Role:
• Reverts the database to its state before the transaction started.
• Used to handle errors or invalid data during a transaction.
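Example (in MySQL, IF ... END IF is only valid inside a stored program, so the check is
wrapped in a procedure; the name WithdrawFunds is illustrative):

DELIMITER //

CREATE PROCEDURE WithdrawFunds()
BEGIN
    START TRANSACTION;

    UPDATE Accounts
    SET Balance = Balance - 1000
    WHERE AccountID = 101;

    -- Simulating an error (e.g., balance goes negative)
    IF (SELECT Balance FROM Accounts WHERE AccountID = 101) < 0 THEN
        ROLLBACK;
    ELSE
        COMMIT;
    END IF;
END //

DELIMITER ;
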
• Explanation: If the balance goes negative, the transaction is rolled back.


• Impact: No changes are saved if an error occurs.
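
In application code the same check is often made by the client rather than inside the database. The following is a minimal sketch under stated assumptions: Python's sqlite3 module with an in-memory SQLite database stands in for MySQL, and the withdraw helper and the opening balance of 600 are hypothetical.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Deduct amount; commit if the balance stays non-negative, else roll back."""
    conn.execute("BEGIN")
    conn.execute(
        "UPDATE Accounts SET Balance = Balance - ? WHERE AccountID = ?",
        (amount, account_id),
    )
    row = conn.execute(
        "SELECT Balance FROM Accounts WHERE AccountID = ?", (account_id,)
    ).fetchone()
    if row is None or row[0] < 0:      # unknown account or overdrawn
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True

# Autocommit mode: transactions are controlled only by the explicit statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, Balance REAL)")
conn.execute("INSERT INTO Accounts VALUES (101, 600)")

ok_small = withdraw(conn, 101, 500)    # leaves 100, so it is committed
ok_large = withdraw(conn, 101, 1000)   # would leave -900, so it is rolled back
balance = conn.execute(
    "SELECT Balance FROM Accounts WHERE AccountID = 101"
).fetchone()[0]
```

Only the first withdrawal is kept; the second leaves no trace in the table because it was rolled back before any COMMIT.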

SAVEPOINT:
Role:
• Creates a checkpoint within a transaction.
• Allows partial rollbacks to specific points without rolling back the entire transaction.

Example:

START TRANSACTION;

INSERT INTO Orders (OrderID, Product, Quantity) VALUES (1, 'Laptop', 2);
SAVEPOINT sp1;

INSERT INTO Orders (OrderID, Product, Quantity) VALUES (2, 'Phone', 3);
SAVEPOINT sp2;

INSERT INTO Orders (OrderID, Product, Quantity) VALUES (3, 'Tablet', 1);

-- Simulate an error
ROLLBACK TO sp2;

COMMIT;

• Explanation:
– sp1 and sp2 are checkpoints.
– The insertion of the Tablet is rolled back, but Laptop and Phone are saved.
• Impact: Only the last operation (after sp2) is undone.
ROLLBACK TO SAVEPOINT:
Role:
• Rolls back part of a transaction to a specific savepoint without affecting earlier operations.
Example:

START TRANSACTION;

UPDATE Products SET Stock = Stock - 10 WHERE ProductID = 1;
SAVEPOINT sp1;

UPDATE Products SET Stock = Stock - 5 WHERE ProductID = 2;
SAVEPOINT sp2;

-- Found an issue with ProductID 2
ROLLBACK TO sp1;

COMMIT;


• Explanation:
– The change to ProductID = 2 is undone.
– The change to ProductID = 1 remains.
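
The partial rollback above can be checked with a short script. This is a sketch only: it assumes Python's sqlite3 module and an in-memory SQLite database (which accepts the same SAVEPOINT and ROLLBACK TO statements), and the starting stock levels of 100 and 50 are invented.

```python
import sqlite3

# Autocommit mode so that BEGIN, SAVEPOINT and COMMIT run exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Stock INTEGER)")
conn.execute("INSERT INTO Products VALUES (1, 100), (2, 50)")

conn.execute("BEGIN")
conn.execute("UPDATE Products SET Stock = Stock - 10 WHERE ProductID = 1")
conn.execute("SAVEPOINT sp1")
conn.execute("UPDATE Products SET Stock = Stock - 5 WHERE ProductID = 2")
conn.execute("SAVEPOINT sp2")
conn.execute("ROLLBACK TO sp1")    # undoes everything done after sp1
conn.execute("COMMIT")             # keeps the work done before sp1

# Map each ProductID to its final stock level.
stock = dict(conn.execute("SELECT ProductID, Stock FROM Products"))
print(stock)
```

Product 1 keeps its reduced stock while product 2 returns to its original value, matching the explanation above.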

Complete Example with COMMIT, ROLLBACK, and SAVEPOINT:

START TRANSACTION;

-- Step 1: Deduct balance from sender
UPDATE Accounts SET Balance = Balance - 1000 WHERE AccountID = 101;
SAVEPOINT sp1;

-- Step 2: Add balance to receiver
UPDATE Accounts SET Balance = Balance + 1000 WHERE AccountID = 202;
SAVEPOINT sp2;

-- Step 3: Error simulation (account not found)
UPDATE Accounts SET Balance = Balance + 500 WHERE AccountID = 999; -- Invalid

-- Step 4: Rollback to last successful operation
ROLLBACK TO sp2;

-- Finalizing the transaction
COMMIT;

Explanation:
• Step 1 & 2: Successful transactions, saved with savepoints.
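• Step 3: Simulated failure: AccountID 999 does not exist. SQL raises no error in this case;
the UPDATE simply matches zero rows, so the application has to detect the problem itself
(for example, by checking the number of affected rows).
• Step 4: Rolled back to sp2, keeping previous updates intact.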
• COMMIT: Finalizes the successful parts.
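
Because an UPDATE against a missing account is not an error in SQL, client code usually inspects the affected-row count to decide whether to roll back to a savepoint. The sketch below assumes Python's sqlite3 module with an in-memory SQLite database instead of MySQL; the credit helper, the savepoint name before_credit, and the opening balances are hypothetical.

```python
import sqlite3

# Autocommit mode: only the explicit BEGIN/COMMIT below delimit the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, Balance REAL)")
conn.execute("INSERT INTO Accounts VALUES (101, 5000), (202, 1000)")

def credit(conn, account_id, amount):
    """Credit an account inside a savepoint; undo the step if no row matched."""
    conn.execute("SAVEPOINT before_credit")
    cur = conn.execute(
        "UPDATE Accounts SET Balance = Balance + ? WHERE AccountID = ?",
        (amount, account_id),
    )
    if cur.rowcount == 0:                          # no such account
        conn.execute("ROLLBACK TO before_credit")
        conn.execute("RELEASE before_credit")
        return False
    conn.execute("RELEASE before_credit")
    return True

conn.execute("BEGIN")
conn.execute("UPDATE Accounts SET Balance = Balance - 1000 WHERE AccountID = 101")
credited_202 = credit(conn, 202, 1000)
credited_999 = credit(conn, 999, 500)              # AccountID 999 does not exist
conn.execute("COMMIT")

# Map each AccountID to its final balance.
balances = dict(conn.execute("SELECT AccountID, Balance FROM Accounts"))
```

The transfer between accounts 101 and 202 is committed, while the attempted credit to the missing account is reported as a failure rather than passing silently.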

Comparison of Transaction Commands:

Command                 Purpose                                  Can Rollback?        Permanent Changes?
START TRANSACTION       Begins a new transaction                 Yes                  No
COMMIT                  Saves all changes permanently            No                   Yes
ROLLBACK                Reverts all changes in the transaction   Yes                  No
SAVEPOINT               Creates a checkpoint                     Yes (to savepoint)   No
ROLLBACK TO SAVEPOINT   Rolls back to a specific savepoint       Yes                  No
RELEASE SAVEPOINT       Deletes a savepoint                      No                   No

• Transactions ensure data integrity with ACID properties.
• COMMIT finalizes changes, while ROLLBACK undoes them.
• SAVEPOINT allows partial rollbacks without affecting the entire transaction.

The role of DELETE, TRUNCATE, and DROP in SQL Transactions:
DELETE, TRUNCATE, and DROP are all SQL commands, but their behavior in the context
of transactions varies depending on the database system.


DELETE in Transactions:
Role:
• DELETE is a DML command that removes specific rows from a table.
• It fully supports transactions, meaning:
– You can roll back a DELETE operation if it's part of an uncommitted transaction.
– Changes become permanent only after a COMMIT.


Example with DELETE:

START TRANSACTION;

DELETE FROM Employees WHERE Department = 'HR';

-- Oops! Realized this was a mistake
ROLLBACK; -- Undoes the DELETE operation

-- COMMIT; -- Would have made the DELETE permanent if used instead of ROLLBACK

• Key Point: The ROLLBACK successfully undoes the DELETE.


• Supported by: MySQL, PostgreSQL, Oracle, etc.
TRUNCATE in Transactions:
Role:
• TRUNCATE is a DDL-like operation (although it affects data).
• Its behavior varies by database system:

Database System   Transaction Support for TRUNCATE
PostgreSQL        Supports rollback for TRUNCATE.
MySQL (InnoDB)    Does NOT support rollback. Auto-committed.
Oracle            Cannot rollback. Auto-commits immediately.

Example in MySQL (No Rollback):

START TRANSACTION;

TRUNCATE TABLE Orders;

ROLLBACK; -- This will NOT undo the TRUNCATE in MySQL

COMMIT;

• Key Point: In MySQL, TRUNCATE commits immediately, making rollback impossible.


• Supported (with rollback) by: PostgreSQL (but not MySQL or Oracle).
DROP in Transactions:
Role:
• DROP is a DDL command that removes entire database objects (tables, views, etc.).
• Like TRUNCATE, its behavior depends on the DBMS:

Database System   Transaction Support for DROP
PostgreSQL        Supports rollback for DROP.
MySQL (InnoDB)    Auto-commits the transaction immediately.
Oracle            Cannot rollback. Auto-committed.

Example in MySQL (No Rollback):

START TRANSACTION;

DROP TABLE Customers;

ROLLBACK; -- This will NOT restore the dropped table in MySQL

COMMIT;

• Key Point: In MySQL, DROP is auto-committed, and cannot be rolled back.


• Supported (with rollback) by: PostgreSQL.


Comparison of DELETE, TRUNCATE, and DROP in Transactions:

Command    Type              Affects Data or Structure?   Rollback Support?                              Auto-Commit?
DELETE     DML               Data                         Yes, fully supported                           No
TRUNCATE   DDL-like (Data)   Data (entire table)          Depends on DBMS (PostgreSQL: yes, MySQL: no)   Auto-commit in MySQL
DROP       DDL               Structure + Data             Depends on DBMS (PostgreSQL: yes, MySQL: no)   Auto-commit in MySQL

1. DELETE fully supports transactions:
– Changes can be rolled back before committing.
– Works like other DML commands (INSERT, UPDATE).
2. TRUNCATE behaves like a DDL command:
– Rollback supported in PostgreSQL.
– Auto-committed in MySQL and Oracle (cannot be rolled back).
3. DROP is a DDL command:
– Rollback supported in PostgreSQL.
– Auto-committed in MySQL and Oracle (cannot be rolled back).
4. Best Practice:
– Always test transaction behaviors in your specific DBMS.
– Use DELETE when you need rollback capability.
– Be cautious with TRUNCATE and DROP, especially in
MySQL.
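
One way to follow the "test it in your own DBMS" advice is a small probe that drops a table inside a transaction, rolls back, and checks whether the table still exists. The sketch below is only an illustration and runs against SQLite through Python's sqlite3 module, not MySQL. SQLite, like PostgreSQL, treats DDL as transactional, so the probe reports True here; the probe table, the function name, and the sqlite_master lookup are specific to this sketch, and a MySQL version would need a different driver and an information_schema query.

```python
import sqlite3

def drop_is_rolled_back(conn):
    """Return True if a DROP TABLE inside a transaction is undone by ROLLBACK."""
    conn.execute("CREATE TABLE probe (id INTEGER PRIMARY KEY)")
    conn.execute("BEGIN")
    conn.execute("DROP TABLE probe")
    conn.execute("ROLLBACK")
    # sqlite_master is SQLite's catalog of schema objects.
    found = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'probe'"
    ).fetchone()[0]
    return found == 1

# Autocommit mode so that BEGIN and ROLLBACK are issued exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
ddl_is_transactional = drop_is_rolled_back(conn)
print(ddl_is_transactional)
```

Running the equivalent statements in MySQL would be expected to show the opposite result, since DROP TABLE causes an implicit commit there.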
DELETE vs TRUNCATE vs DROP in SQL:
This part summarizes the key differences between the DELETE, TRUNCATE, and DROP commands
in SQL. These commands are used for managing data and database objects, but they serve
different purposes.
• DELETE: Removes specific records from a table based on a condition.
• TRUNCATE: Deletes all records from a table quickly without logging individual row deletions.
• DROP: Removes the entire table, including its structure and data.


DELETE Command:
Purpose:
The DELETE command is used to remove specific rows from a table using the WHERE clause.
Syntax:

DELETE FROM table_name
WHERE condition;

Example:

DELETE FROM Employees
WHERE Department = 'HR';

• Explanation: Deletes all employees from the HR department.


Key Points:
• Deletes specific rows if a condition is provided.
• If no WHERE clause is used, all rows will be deleted.
• Can be rolled back using transactions (COMMIT/ROLLBACK).


• Triggers (if defined) are activated.


When to Use?
• When you need to delete specific records while keeping the table structure intact.
TRUNCATE Command:
Purpose:
The TRUNCATE command is used to quickly delete all records from a table.
Syntax:

TRUNCATE TABLE table_name;

Example:

TRUNCATE TABLE Employees;

• Explanation: Deletes all rows from the Employees table.

Key Points:
• Deletes all records from the table.
• Cannot be rolled back in some databases (DBMS-dependent).
• Resets auto-increment counters.
• Faster than DELETE because it minimizes logging.
• Does not activate triggers in most databases.
When to Use?
• When you need to quickly delete all data while keeping the table structure intact.
DROP Command:
Purpose:
The DROP command is used to remove database objects like tables, views, indexes, or entire
databases.
Syntax:

DROP TABLE table_name;

Example:

DROP TABLE Employees;

• Explanation: Completely removes the Employees table from the database, including
its structure.
Key Points:
• Deletes the entire table (structure + data).
• Cannot be rolled back in most databases.
• Removes all constraints, indexes, and triggers associated with the table.
When to Use?
• When you need to completely remove a table from the database.

Comparison Table: DELETE vs TRUNCATE vs DROP:

Criteria              DELETE                               TRUNCATE                             DROP
Purpose               Deletes specific rows                Deletes all rows quickly             Removes the table structure and data
Syntax                DELETE FROM table WHERE condition;   TRUNCATE TABLE table;                DROP TABLE table;
Condition Support     Yes (WHERE clause supported)         No (WHERE clause not supported)      Not applicable
Rollback Support      Yes (if within a transaction)        Depends on DBMS                      No in most databases
Auto-Increment Reset  No                                   Yes                                  Yes
Affects Structure?    No                                   No                                   Yes (removes structure)
Triggers              Yes (triggers are fired)             No (triggers are not fired)          No
Performance           Slower for large datasets            Faster than DELETE                   Fastest (removes the object)
Use Case              Delete specific records              Remove all records, keep structure   Completely remove the table

Differences:
1. Data vs Structure:
• DELETE: Removes data but keeps the table structure.
• TRUNCATE: Removes all data but keeps the structure intact.
• DROP: Removes both the data and the structure.
2. Transaction Control:
• DELETE supports rollback with transactions.
• TRUNCATE may or may not support rollback (DBMS-dependent).
• DROP is irreversible in most databases.
3. Performance:
• TRUNCATE is faster than DELETE for large datasets.
• DROP is the fastest as it completely removes the table.
Real-World Use Cases:

1. Deleting Specific Records:

DELETE FROM Orders WHERE OrderDate < '2022-01-01';

• Use Case: Clean old data without affecting current records.


2. Clearing an Entire Table:

TRUNCATE TABLE Logs;

• Use Case: Periodically clear logs while keeping the table structure.


3. Removing a Table Completely:

DROP TABLE Temp_Data;

• Use Case: Remove a temporary table that is no longer needed.


Remarks:
• Use DELETE when you need to remove specific records while preserving the table structure.
• Use TRUNCATE when you need to quickly delete all data from a table while keeping its
schema intact.
• Use DROP when you want to completely remove a table or database object.
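
The difference between removing rows and removing the table can be observed directly. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database, which has no TRUNCATE statement, so an unqualified DELETE stands in for "remove all rows". The table_exists helper, the EmpID column, and the sample rows are hypothetical.

```python
import sqlite3

# Autocommit mode keeps the example free of implicit transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Employees (EmpID INTEGER PRIMARY KEY, Department TEXT)")
conn.executemany(
    "INSERT INTO Employees (Department) VALUES (?)",
    [("HR",), ("HR",), ("IT",)],
)

def table_exists(conn, name):
    """Check SQLite's schema catalog for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

# DELETE with WHERE removes only the matching rows.
deleted_hr = conn.execute("DELETE FROM Employees WHERE Department = 'HR'").rowcount
remaining = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]

# DELETE without WHERE empties the table but keeps its structure.
conn.execute("DELETE FROM Employees")
rows_after_delete_all = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
exists_after_delete = table_exists(conn, "Employees")

# DROP removes the table itself.
conn.execute("DROP TABLE Employees")
exists_after_drop = table_exists(conn, "Employees")
```

After the two DELETE statements the table is empty but still queryable; after DROP TABLE it no longer appears in the schema.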

14 RDBMS vs. NoSQL

Feature          RDBMS                       NoSQL
Data Model       Tables with relations       Document, Key-Value, Graph
Schema           Fixed schema                Flexible schema
Query Language   SQL                         NoSQL (varies by type)
Scalability      Vertical Scaling            Horizontal Scaling
Use Case         OLTP (Transactional Apps)   Big Data, Real-Time Apps

15 SMART Health Management Database

The Smart Health Management System (SHMS) revolutionizes traditional healthcare practices
by leveraging advanced technologies to provide personalized, efficient, and accessible
healthcare services. It bridges the gap between patients and healthcare providers, ensuring
better health outcomes and quality of life in the digital era.
Description: Develop a system to manage patient records, doctor schedules, and medical
inventory using a centralized database. Incorporate analytics to predict patient trends and
automate reminders for appointments and medication.
Trends: Healthcare data analytics, IoT integration for health monitoring.
The Smart Health Management System will have the following key entities and relationships:
Entities:
• Patient
• Doctor
• Appointment
• Medical_Record
• Medication
• Pharmacy
• Hospital_Staff
• Room
• Billing
ER Diagram Overview:
Here’s the breakdown of entities, attributes, and their relationships:
Patient
Attributes:
• Patient_ID (PK)
• Name
• Age
• Gender
• Contact
• Address
• Email
Relationships:
• Makes appointments with doctors
• Has medical records
Doctor
Attributes:
• Doctor_ID (PK)
• Name
• Specialization
• Contact
• Email
• Room_Assigned
Relationships:
• Attends to patients through appointments
Appointment
Attributes:
• Appointment_ID (PK)
• Patient_ID (FK)
• Doctor_ID (FK)
• Appointment_Date
• Time
• Status
Relationships:
• Links patients and doctors
Medical_Record
Attributes:
• Record_ID (PK)
• Patient_ID (FK)
• Doctor_ID (FK)
• Diagnosis
• Treatment
• Date
Relationships:
• Belongs to patients
• Created by doctors
Medication
Attributes:
• Medication_ID (PK)
• Name
• Type
• Dosage
• Side_Effects
Relationships:
• Prescribed to patients (via Medical_Record)
Pharmacy
Attributes:
• Pharmacy_ID (PK)
• Name
• Location
• Contact
Relationships:
• Provides medications
Hospital_Staff
Attributes:
• Staff_ID (PK)
• Name
• Role
• Contact
Relationships:
• Assigned to hospital operations
Room
Attributes:
• Room_ID (PK)
• Type
• Availability_Status
Relationships:
• Assigned to patients or doctors
Billing
Attributes:
• Bill_ID (PK)
• Patient_ID (FK)
• Amount
• Date
• Status
Relationships:
• Linked to patients

The following provides the SQL schema and sample data for a Smart Health Management
System. It includes table creation, inserting data, and maintaining relationships in MySQL.
Database Creation:

CREATE DATABASE HealthManagementSystem;
USE HealthManagementSystem;

Table Creation:
1. Patient Table

CREATE TABLE Patient (
    Patient_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Age INT CHECK (Age >= 0),
    Gender ENUM('Male', 'Female', 'Other') NOT NULL,
    Address VARCHAR(255),
    Contact_Number VARCHAR(15),
    Email VARCHAR(100),
    Emergency_Contact VARCHAR(15),
    Insurance_ID INT,
    -- An expression default must be parenthesized (MySQL 8.0.13 or later)
    Registration_Date DATE DEFAULT (CURRENT_DATE)
);


2. Doctor Table

CREATE TABLE Doctor (
    Doctor_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Specialization VARCHAR(100) NOT NULL,
    Qualification VARCHAR(50),
    Contact_Number VARCHAR(15),
    Email VARCHAR(100),
    Availability VARCHAR(50),
    Years_of_Experience INT CHECK (Years_of_Experience >= 0)
);


3. Appointment Table

CREATE TABLE Appointment (
    Appointment_ID INT AUTO_INCREMENT PRIMARY KEY,
    Patient_ID INT NOT NULL,
    Doctor_ID INT NOT NULL,
    Appointment_Date DATE NOT NULL,
    Appointment_Time TIME NOT NULL,
    Status ENUM('Scheduled', 'Completed', 'Canceled') DEFAULT 'Scheduled',
    Reason_for_Visit VARCHAR(255),
    FOREIGN KEY (Patient_ID) REFERENCES Patient(Patient_ID),
    FOREIGN KEY (Doctor_ID) REFERENCES Doctor(Doctor_ID)
);

4. Medical_Record Table

CREATE TABLE Medical_Record (
    Record_ID INT AUTO_INCREMENT PRIMARY KEY,
    Patient_ID INT NOT NULL,
    Doctor_ID INT NOT NULL,
    Diagnosis VARCHAR(255),
    Treatment_Details TEXT,
    Tests_Conducted TEXT,
    Prescription_Details TEXT,
    -- An expression default must be parenthesized (MySQL 8.0.13 or later)
    Record_Date DATE DEFAULT (CURRENT_DATE),
    FOREIGN KEY (Patient_ID) REFERENCES Patient(Patient_ID),
    FOREIGN KEY (Doctor_ID) REFERENCES Doctor(Doctor_ID)
);

5. Room Table

CREATE TABLE Room (
    Room_ID INT AUTO_INCREMENT PRIMARY KEY,
    Room_Type ENUM('General', 'ICU', 'Private') NOT NULL,
    Room_Status ENUM('Available', 'Occupied', 'Maintenance') DEFAULT 'Available',
    Daily_Rate DECIMAL(10, 2),
    Floor_Number INT CHECK (Floor_Number >= 0)
);

6. Hospital_Staff Table

CREATE TABLE Hospital_Staff (
    Staff_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Designation VARCHAR(50) NOT NULL,
    Contact_Number VARCHAR(15),
    Email VARCHAR(100),
    Department VARCHAR(100),
    Shift_Timings VARCHAR(50),
    Salary DECIMAL(10, 2) CHECK (Salary >= 0)
);

7. Billing Table

CREATE TABLE Billing (
    Bill_ID INT AUTO_INCREMENT PRIMARY KEY,
    Patient_ID INT NOT NULL,
    Appointment_ID INT,
    Room_Charges DECIMAL(10, 2) DEFAULT 0.0,
    Medication_Charges DECIMAL(10, 2) DEFAULT 0.0,
    Test_Charges DECIMAL(10, 2) DEFAULT 0.0,
    Doctor_Fees DECIMAL(10, 2) DEFAULT 0.0,
    Total_Amount DECIMAL(10, 2) GENERATED ALWAYS AS
        (Room_Charges + Medication_Charges + Test_Charges + Doctor_Fees) STORED,
    Payment_Status ENUM('Paid', 'Unpaid') DEFAULT 'Unpaid',
    Payment_Method ENUM('Cash', 'Card', 'Insurance') DEFAULT 'Cash',
    FOREIGN KEY (Patient_ID) REFERENCES Patient(Patient_ID),
    FOREIGN KEY (Appointment_ID) REFERENCES Appointment(Appointment_ID)
);

8. Medication Table

CREATE TABLE Medication (
    Medication_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Description TEXT,
    Dosage VARCHAR(50),
    Manufacturer VARCHAR(100),
    Expiry_Date DATE,
    Price DECIMAL(10, 2) CHECK (Price >= 0)
);

9. Pharmacy Table

CREATE TABLE Pharmacy (
    Pharmacy_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Address VARCHAR(255),
    Contact_Number VARCHAR(15),
    Email VARCHAR(100),
    Opening_Hours VARCHAR(50)
);

15.1 Sample Data Insertion

This section provides sample data for each table in the Health Management System.

1. Insert into Patient Table

INSERT INTO Patient (Name, Age, Gender, Address, Contact_Number, Email, Emergency_Contact, Insurance_ID, Registration_Date)
VALUES
('John Doe', 45, 'Male', '123 Main St', '1234567890', '[email protected]', '9876543210', 1, '2025-01-01'),
('Jane Smith', 34, 'Female', '456 Elm St', '0987654321', '[email protected]', '1234509876', NULL, '2025-01-02');

2. Insert into Doctor Table


INSERT INTO Doctor (Name, Specialization, Qualification, Contact_Number, Email, Availability, Years_of_Experience)
VALUES
('Dr. Alice Brown', 'Cardiology', 'MD', '1122334455', '[email protected]', 'Mon-Fri 9AM-5PM', 15),
('Dr. Bob White', 'Neurology', 'PhD', '2233445566', '[email protected]', 'Tue-Thu 10AM-4PM', 10);

3. Insert into Appointment Table

INSERT INTO Appointment (Patient_ID, Doctor_ID, Appointment_Date, Appointment_Time, Status, Reason_for_Visit)
VALUES
(1, 1, '2025-01-05', '10:30:00', 'Scheduled', 'Routine Checkup'),
(2, 2, '2025-01-06', '11:00:00', 'Scheduled', 'Migraine Consultation');

4. Insert into Medical_Record Table

INSERT INTO Medical_Record (Patient_ID, Doctor_ID, Diagnosis, Treatment_Details, Tests_Conducted, Prescription_Details, Record_Date)
VALUES
(1, 1, 'Hypertension', 'Monitor BP daily, reduce salt intake', 'Blood Pressure Test', 'Losartan 50mg daily', '2025-01-05'),
(2, 2, 'Migraine', 'Avoid triggers, prescribed medication', 'MRI Scan', 'Sumatriptan 25mg as needed', '2025-01-06');

5. Insert into Room Table

INSERT INTO Room (Room_Type, Room_Status, Daily_Rate, Floor_Number)
VALUES
('ICU', 'Available', 3000.00, 2),
('Private', 'Occupied', 2000.00, 3);

6. Insert into Hospital_Staff Table


INSERT INTO Hospital_Staff (Name, Designation, Contact_Number, Email, Department, Shift_Timings, Salary)
VALUES
('Mary Johnson', 'Nurse', '3344556677', '[email protected]', 'ICU', 'Night Shift', 50000.00),
('Peter Clark', 'Technician', '4455667788', '[email protected]', 'Radiology', 'Day Shift', 40000.00);

7. Insert into Billing Table


INSERT INTO Billing (Patient_ID, Appointment_ID, Room_Charges, Medication_Charges, Test_Charges, Doctor_Fees, Payment_Status, Payment_Method)
VALUES
(1, 1, 6000.00, 200.00, 500.00, 1000.00, 'Paid', 'Card'),
(2, 2, 0.00, 300.00, 800.00, 1500.00, 'Unpaid', 'Cash');

8. Insert into Medication Table

INSERT INTO Medication (Name, Description, Dosage, Manufacturer, Expiry_Date, Price)
VALUES
('Losartan', 'Antihypertensive', '50mg daily', 'Pfizer', '2026-12-31', 1.50),
('Sumatriptan', 'Anti-migraine', '25mg as needed', 'GSK', '2026-06-30', 2.50);

9. Insert into Pharmacy Table

INSERT INTO Pharmacy (Name, Address, Contact_Number, Email, Opening_Hours)
VALUES
('Central Pharmacy', '789 Oak St', '5566778899', '[email protected]', '8AM-8PM'),
('HealthMart', '101 Maple St', '6677889900', '[email protected]', '9AM-9PM');
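
Once sample data is loaded, the foreign keys make it possible to join appointments to patient and doctor names, and to reject appointments for patients that do not exist. The sketch below is a reduced illustration, not the full schema: it assumes Python's sqlite3 module and an in-memory SQLite database instead of MySQL, keeps only a few columns per table, and uses SQLite types (INTEGER, TEXT) in place of the MySQL ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Patient (Patient_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Doctor  (Doctor_ID  INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                      Specialization TEXT NOT NULL);
CREATE TABLE Appointment (
    Appointment_ID   INTEGER PRIMARY KEY,
    Patient_ID       INTEGER NOT NULL REFERENCES Patient(Patient_ID),
    Doctor_ID        INTEGER NOT NULL REFERENCES Doctor(Doctor_ID),
    Appointment_Date TEXT NOT NULL,
    Reason_for_Visit TEXT
);
INSERT INTO Patient VALUES (1, 'John Doe'), (2, 'Jane Smith');
INSERT INTO Doctor  VALUES (1, 'Dr. Alice Brown', 'Cardiology'),
                           (2, 'Dr. Bob White', 'Neurology');
INSERT INTO Appointment VALUES
    (1, 1, 1, '2025-01-05', 'Routine Checkup'),
    (2, 2, 2, '2025-01-06', 'Migraine Consultation');
""")

# Resolve the foreign keys into readable names.
schedule = conn.execute("""
    SELECT p.Name, d.Name, d.Specialization, a.Appointment_Date
    FROM Appointment AS a
    JOIN Patient AS p ON p.Patient_ID = a.Patient_ID
    JOIN Doctor  AS d ON d.Doctor_ID  = a.Doctor_ID
    ORDER BY a.Appointment_Date
""").fetchall()

# An appointment for a non-existent patient violates the foreign key.
try:
    conn.execute(
        "INSERT INTO Appointment VALUES (3, 99, 1, '2025-01-07', 'Follow-up')"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

The join returns one row per appointment with both names filled in, and the insert for Patient_ID 99 is refused, which is the referential integrity the FOREIGN KEY clauses are meant to provide.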

16 AI-Powered Recruitment System

Problem Description:
The AI-Powered Recruitment System addresses the inefficiencies and biases in traditional
recruitment processes by leveraging artificial intelligence and predictive analytics. It provides
a centralized platform for candidates and recruiters to interact effectively, streamlining
the process of matching candidates with suitable jobs.
Design a database system for an AI-driven recruitment platform named AIDRecruite. The
system should manage candidates, job postings, applications, interviews, recruiters, and
AI-based job matching. The database should support:
1. Storing candidate details including multi-valued skills.
2. Managing job postings with multi-valued required skills.
3. Handling applications and tracking their statuses.
4. Scheduling and recording interviews.
5. Assigning recruiters to job postings.
6. AI-based matching of candidates to jobs based on skills and experience.
Key Features:
1. Candidate Management
• Candidates can:
– Register and create profiles.
– Upload resumes and update skills.
– Track their applications.
2. Job Management
• Recruiters can:
– Post job openings with detailed descriptions.
– Define required skills, experience, and salary ranges.
3. AI-Powered Matching
• AI algorithms analyze:
– Candidate profiles.
– Job descriptions.
• Generates a match score based on:
– Skills.
– Experience.
– Job requirements.
4. Application Tracking
• Tracks application statuses (e.g., applied, shortlisted, rejected, hired).
• Ensures transparency for candidates and recruiters.
5. Interview Management
• Schedules and tracks interviews for candidates.
• Records feedback and outcomes for each round.
6. Recruiter Dashboard
• Offers tools to:
– Manage job postings.
– Review candidate matches.
– Track application statuses.
• Provides insights using analytics.
7. Candidate Dashboard
• Allows candidates to:
– View job recommendations.
– Track application statuses.
– Receive notifications.
Challenges Addressed:
1. Inefficient Matching
– Reduces manual effort in screening candidates.
– Identifies the best matches quickly.
2. Bias in Recruitment
– Focuses on skills and experience.
– Reduces subjective decision-making.
3. Time-Consuming Processes
– Automates screening and shortlisting.
– Saves time for recruiters and candidates.
4. Application Overload
– Manages large volumes of applications.
– Ranks candidates effectively.
Objective:
To design and implement an AI-Powered Recruitment System that:
1. Enhances Hiring Efficiency: Matches candidates to jobs with high accuracy using AI.
2. Improves Candidate Experience: Offers tailored job recommendations and real-time
tracking.
3. Supports Data-Driven Recruitment: Provides predictive insights for better hiring
decisions.
Entities and Relationships:
The system is structured around the following entities and their relationships:
• Candidate: Represents job seekers and their profiles.
• Job: Represents job openings posted by recruiters.
• Application: Tracks applications submitted by candidates.
• Interview: Manages interview scheduling and feedback.
• Recruiter: Represents hiring personnel managing job postings.
• AI_Matching: Links candidates and jobs with a match score.
• Job_Assignment: Links Jobs and Recruiters by assigning recruiters to manage specific
job postings.
Entity-Relationship Diagram:
The following diagram represents the relationship between Candidate, AI_Matching, and
Job entities.

+-------------+          +-------------+          +-------------+
| Candidate   |          | AI_Matching |          | Job         |
|-------------|          |-------------|          |-------------|
| Candidate_ID| 1      N | Match_ID    | N      1 | Job_ID      |
| Name        |----------| Candidate_ID|----------| Title       |
| Skills      | Matches  | Job_ID      | Matches  | Skills      |
+-------------+          | Match_Score |          +-------------+
                         +-------------+

Explanation:
• Candidate Table: Stores details about job candidates (Candidate_ID, Name, Skills).
• Job Table: Stores job listings (Job_ID, Title, Skills required).
• AI_Matching Table: Represents the many-to-many relationship between Candidate and
Job.
– Match_ID: Unique identifier for each match.
– Candidate_ID: Foreign key referencing Candidate.
– Job_ID: Foreign key referencing Job.
– Match_Score: AI-generated score indicating suitability.
Relationships:
• One Candidate can match with multiple Jobs (1:N).
• One Job can match with multiple Candidates (N:1).
• AI_Matching serves as a bridge table for many-to-many relationships.
This structured model ensures efficient job-candidate matching using AI-based scoring.
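
The notes do not specify how Match_Score is computed. Purely as an illustration, the sketch below assumes a very simple rule: the score is the share of a job's required skills that the candidate also lists. The match_score function, the sample candidates and jobs, and the tuple layout of the ai_matching rows are all hypothetical; a real system would likely weigh experience and other factors as well.

```python
def match_score(candidate_skills, job_skills):
    """Share of the job's required skills that the candidate has (0.0 to 1.0)."""
    required = {s.strip().lower() for s in job_skills}
    if not required:
        return 0.0
    offered = {s.strip().lower() for s in candidate_skills}
    return len(required & offered) / len(required)

# Hypothetical multi-valued skills, keyed by Candidate_ID and Job_ID.
candidates = {1: ["Python", "SQL", "Docker"], 2: ["Java", "SQL"]}
jobs = {10: ["python", "sql"], 20: ["Java", "Spring", "SQL", "AWS"]}

# One AI_Matching row per (candidate, job) pair:
# (Match_ID, Candidate_ID, Job_ID, Match_Score)
ai_matching = []
for cid, c_skills in candidates.items():
    for jid, j_skills in jobs.items():
        score = match_score(c_skills, j_skills)
        ai_matching.append((len(ai_matching) + 1, cid, jid, score))

for row in ai_matching:
    print(row)
```

Each generated tuple corresponds to one row of the AI_Matching bridge table, so ranking candidates for a job amounts to sorting its rows by Match_Score.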

+-------------+          +-----------------+          +-------------+
| Candidate   |          | Application     |          | Job         |
|-------------|          |-----------------|          |-------------|
| Candidate_ID| 1      N | Application_ID  | N      1 | Job_ID      |
| Name        |----------| Candidate_ID    |----------| Title       |
| Skills      | Applies  | Job_ID          | Matches  | Skills      |
+-------------+          | Application_Date|          +-------------+
                         | Status          |
                         +-----------------+

Explanation:
• Candidate Table: Stores details about job candidates, such as Candidate_ID, Name,
and Skills.
• Job Table: Stores job listings, including Job_ID, Title, and the required Skills.
• Application Table: Represents the relationship between candidates and jobs in the
application process.
– Application_ID: Unique identifier for each application.
– Candidate_ID: Foreign key referencing the Candidate table, indicating which candidate
submitted the application.
– Job_ID: Foreign key referencing the Job table, indicating which job the candidate
applied for.
– Application_Date: The date the application was submitted.
– Status: The current status of the application (e.g., pending, accepted, rejected).
Relationships:
• One Candidate can apply for multiple Jobs (1:N).
• One Job can have multiple Applications (N:1), representing various candidates applying
for the same position.
• The Application Table serves as a bridge for the many-to-many relationship between
Candidates and Jobs, with additional details like application status and submission date.
Job-Recruiter Entity-Relationship (ER) Diagram
The following ER diagram illustrates the relationship between Job and Recruiter entities
using an associative entity called Job_Assignment to handle the many-to-many relationship.
ER Diagram:

+-------------+          +----------------+          +-------------+
| Job         |          | Job_Assignment |          | Recruiter   |
|-------------|          |----------------|          |-------------|
| Job_ID      | 1      N | Assignment_ID  | N      1 | Recruiter_ID|
| Title       |----------| Job_ID         |----------| Name        |
| Skills      | Assigned | Recruiter_ID   | Manages  | Email       |
| Location    | To       | Assigned_Date  | Jobs     | Phone       |
+-------------+          +----------------+          +-------------+

Entity Descriptions:
• Job
– Job_ID (Primary Key)
– Title: The title of the job position.
– Skills: Required skills for the job.
– Location: Job location.
• Recruiter
– Recruiter_ID (Primary Key)
– Name: Name of the recruiter.
– Email: Recruiter’s email address.
– Phone: Contact number of the recruiter.
• Job_Assignment
– Assignment_ID (Primary Key)
– Job_ID (Foreign Key referencing Job)
– Recruiter_ID (Foreign Key referencing Recruiter)
– Assigned_Date: The date when the job was assigned to the recruiter.
Relationships:
• One Job can be assigned to multiple Recruiters.
• One Recruiter can manage multiple Jobs.
• The Job_Assignment table represents this many-to-many relationship.
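
A bridge table like this can be queried in both directions. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database rather than MySQL, trims Job and Recruiter to two columns each, adds a UNIQUE (Job_ID, Recruiter_ID) constraint that the notes do not mention, and uses invented job titles and recruiter names.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Recruiter (Recruiter_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Job_Assignment (
    Assignment_ID INTEGER PRIMARY KEY,
    Job_ID        INTEGER NOT NULL REFERENCES Job(Job_ID),
    Recruiter_ID  INTEGER NOT NULL REFERENCES Recruiter(Recruiter_ID),
    Assigned_Date TEXT,
    UNIQUE (Job_ID, Recruiter_ID)   -- assumption: no duplicate assignments
);
INSERT INTO Job VALUES (1, 'Data Engineer'), (2, 'ML Engineer');
INSERT INTO Recruiter VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Job_Assignment (Job_ID, Recruiter_ID, Assigned_Date) VALUES
    (1, 1, '2025-01-10'), (1, 2, '2025-01-11'), (2, 1, '2025-01-12');
""")

# Direction 1: all recruiters assigned to one job.
recruiters_for_job_1 = [row[0] for row in conn.execute("""
    SELECT r.Name
    FROM Job_Assignment AS ja
    JOIN Recruiter AS r ON r.Recruiter_ID = ja.Recruiter_ID
    WHERE ja.Job_ID = 1
    ORDER BY r.Name
""")]

# Direction 2: all jobs managed by one recruiter.
jobs_for_recruiter_1 = [row[0] for row in conn.execute("""
    SELECT j.Title
    FROM Job_Assignment AS ja
    JOIN Job AS j ON j.Job_ID = ja.Job_ID
    WHERE ja.Recruiter_ID = 1
    ORDER BY j.Title
""")]
```

Job 1 has two recruiters and recruiter 1 has two jobs, so neither table could hold the other's key directly; the Job_Assignment rows carry the relationship.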


AI-Driven Recruitment System ER Diagram:
The following Entity-Relationship (ER) Diagram illustrates the relationships between key
entities in the recruitment system:

+-----------------+          +-------------------+          +-------------------+
| Candidate       |          | Application       |          | Interview         |
|-----------------|          |-------------------|          |-------------------|
| Candidate_ID(PK)| 1      N | Application_ID(PK)| 1      N | Interview_ID(PK)  |
| Name            |----------| Candidate_ID(FK)  |----------| Application_ID(FK)|
| Email           | Applies  | Job_ID(FK)        | Applied  | Interview_Date    |
| Phone           | For      | Status            | To       | Interviewer       |
| Skills          |          | Application_Date  |          | Feedback          |
| Experience      |          +-------------------+          | Outcome           |
+-----------------+                    | N                  +-------------------+
        | (1)                          |
        | Matches                      |
        | (N)                          | 1
+-----------------+          +-------------------+          +-------------------+
| AI_Matching     |          | Job               |          | Job_Assignment    |
|-----------------|          |-------------------|          |-------------------|
| Match_ID(PK)    | N      1 | Job_ID(PK)        | 1      N | Assignment_ID(PK) |
| Candidate_ID(FK)|----------| Title             |----------| Job_ID(FK)        |
| Job_ID(FK)      | Matches  | Skills            | Assigned | Recruiter_ID(FK)  |
| Match_Score     |          | Location          |          | Assigned_Date     |
+-----------------+          | Salary            |          +-------------------+
                             +-------------------+                    | (N)
                                                                      | Manages
                                                                      | (1)
                                                            +-------------------+
                                                            | Recruiter         |
                                                            |-------------------|
                                                            | Recruiter_ID(PK)  |
                                                            | Name              |
                                                            | Email             |
                                                            | Phone             |
                                                            +-------------------+


Entity Descriptions:
• Candidate
  – Candidate_ID (Primary Key)
  – Name: Full name of the candidate.
  – Email: Candidate’s email address.
  – Phone: Contact number of the candidate.
  – Skills: List of candidate skills.
  – Experience: Years of professional experience.
• Application
  – Application_ID (Primary Key)
  – Candidate_ID (Foreign Key referencing Candidate)
  – Job_ID (Foreign Key referencing Job)
  – Status: Application status (e.g., Applied, Shortlisted, Hired).
  – Application_Date: Date of application submission.
• Interview
  – Interview_ID (Primary Key)
  – Application_ID (Foreign Key referencing Application)
  – Interview_Date: Scheduled date of the interview.
  – Interviewer: Name of the interviewer.
  – Feedback: Feedback from the interview.
  – Outcome: Interview outcome (e.g., Pass, Fail).
• AI_Matching
  – Match_ID (Primary Key)
  – Candidate_ID (Foreign Key referencing Candidate)
  – Job_ID (Foreign Key referencing Job)
  – Match_Score: Score representing the match strength between candidate and job.
• Job
  – Job_ID (Primary Key)
  – Title: Job title.
  – Skills: Required skills for the job.
  – Location: Job location.
  – Salary: Offered salary for the job.
• Job_Assignment
  – Assignment_ID (Primary Key)
  – Job_ID (Foreign Key referencing Job)
  – Recruiter_ID (Foreign Key referencing Recruiter)
  – Assigned_Date: Date when the recruiter was assigned to the job.
• Recruiter
  – Recruiter_ID (Primary Key)
  – Name: Full name of the recruiter.
  – Email: Recruiter’s email address.
  – Phone: Contact number of the recruiter.


Relationships:
• Candidate ↔ Application: A candidate can submit multiple applications (1:N).
• Application ↔ Job: Each application is for a specific job (N:1).
• Application ↔ Interview: An application can lead to multiple interviews (1:N).

• Candidate ↔ AI_Matching: A candidate can have multiple matches with different jobs (1:N).
• Job ↔ AI_Matching: A job can have multiple candidate matches (1:N).
• Job ↔ Job_Assignment: A job can be assigned to multiple recruiters (1:N).
• Recruiter ↔ Job_Assignment: A recruiter can manage multiple jobs (1:N).
SQL Schema:

-- Job Table
CREATE TABLE Job (
    Job_ID INT PRIMARY KEY,
    Title VARCHAR(100),
    Skills VARCHAR(255),
    Location VARCHAR(100)
);

-- Recruiter Table
CREATE TABLE Recruiter (
    Recruiter_ID INT PRIMARY KEY,
    Name VARCHAR(100),
    Email VARCHAR(100),
    Phone VARCHAR(15)
);

-- Job_Assignment Table
CREATE TABLE Job_Assignment (
    Assignment_ID INT PRIMARY KEY,
    Job_ID INT,
    Recruiter_ID INT,
    Assigned_Date DATE,
    FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID),
    FOREIGN KEY (Recruiter_ID) REFERENCES Recruiter(Recruiter_ID)
);

Sample Queries:
• Find Jobs Without a Recruiter:

SELECT J.*
FROM Job J
LEFT JOIN Job_Assignment JA ON J.Job_ID = JA.Job_ID
WHERE JA.Recruiter_ID IS NULL;

• Find Recruiters Without Assigned Jobs:

SELECT R.*
FROM Recruiter R
LEFT JOIN Job_Assignment JA ON R.Recruiter_ID = JA.Recruiter_ID
WHERE JA.Job_ID IS NULL;
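
The two sample queries can be tried out with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module in place of MySQL (an assumption made so the example runs anywhere); the recruiter names and the single assignment row are hypothetical sample data, not taken from the notes.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Recruiter (Recruiter_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Job_Assignment (
    Assignment_ID INTEGER PRIMARY KEY,
    Job_ID INTEGER REFERENCES Job(Job_ID),
    Recruiter_ID INTEGER REFERENCES Recruiter(Recruiter_ID)
);
-- Hypothetical rows: only job 1 has a recruiter, only recruiter 1 has a job.
INSERT INTO Job VALUES (1, 'Software Engineer'), (2, 'Web Developer');
INSERT INTO Recruiter VALUES (1, 'Recruiter A'), (2, 'Recruiter B');
INSERT INTO Job_Assignment VALUES (1, 1, 1);
""")

# LEFT JOIN keeps every Job row; unmatched rows carry NULL in the JA columns.
jobs_without_recruiter = [row[0] for row in conn.execute("""
    SELECT J.Title
    FROM Job J
    LEFT JOIN Job_Assignment JA ON J.Job_ID = JA.Job_ID
    WHERE JA.Recruiter_ID IS NULL
""")]

# Same pattern with Recruiter as the preserved (left) table.
recruiters_without_job = [row[0] for row in conn.execute("""
    SELECT R.Name
    FROM Recruiter R
    LEFT JOIN Job_Assignment JA ON R.Recruiter_ID = JA.Recruiter_ID
    WHERE JA.Job_ID IS NULL
""")]

print(jobs_without_recruiter)
print(recruiters_without_job)
```

With this sample data the first list holds only the unassigned job and the second only the unassigned recruiter.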

Example Workflow:
1. Candidate A uploads their resume with skills:
   • Python
   • Machine Learning
   • Data Analysis
2. Job X requires skills:
   • Python
   • SQL
   • Machine Learning
3. The system performs the following steps:
   a. Preprocesses and extracts the skills from the resume and job description.
   b. Matches Candidate A’s skills with Job X, calculating a Match_Score of 85%.
4. Outcome:
   • Candidate A is ranked #1 for Job X.
   • Candidate A is recommended to the recruiter.
5. Next Steps:
   • The recruiter shortlists Candidate A.
   • Schedules an interview.
   • Records feedback after the interview.
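
The notes do not state how the Match_Score is computed. As a rough illustration of step 3b only, the sketch below scores a candidate by the share of required skills they cover. The function name match_score and the coverage formula are assumptions, and this simple rule gives a different number from the 85% quoted in the workflow, which presumably comes from a richer model.

```python
def match_score(candidate_skills, job_skills):
    """Percentage of the job's required skills that the candidate has.

    Hypothetical scoring rule: plain set overlap, case-insensitive.
    """
    candidate = {s.strip().lower() for s in candidate_skills}
    required = {s.strip().lower() for s in job_skills}
    if not required:
        return 0.0  # no requirements listed, nothing to match against
    return round(100 * len(candidate & required) / len(required), 2)


candidate_a = ["Python", "Machine Learning", "Data Analysis"]
job_x = ["Python", "SQL", "Machine Learning"]

# Two of the three required skills are covered.
score = match_score(candidate_a, job_x)
print(score)
```

A value produced this way fits the DECIMAL(5,2) range 0 to 100 used for Match_Score in the schema.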
Database Schema:
CREATE DATABASE AIDRecruite;
USE AIDRecruite;

Candidate Table:

CREATE TABLE Candidate (
    Candidate_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(255) NOT NULL,
    Email VARCHAR(255) UNIQUE NOT NULL,
    Phone VARCHAR(20) UNIQUE NOT NULL,
    Address TEXT,
    Resume_Link VARCHAR(255),
    Experience_Years INT,
    Profile_Creation_Date DATETIME DEFAULT CURRENT_TIMESTAMP
);

Candidate Skills Table (Multi-valued Attribute):

CREATE TABLE Candidate_Skills (
    Candidate_ID INT,
    Skill VARCHAR(100),
    PRIMARY KEY (Candidate_ID, Skill),
    FOREIGN KEY (Candidate_ID) REFERENCES Candidate(Candidate_ID) ON DELETE CASCADE
);

Job Table:

CREATE TABLE Job (
    Job_ID INT AUTO_INCREMENT PRIMARY KEY,
    Title VARCHAR(255) NOT NULL,
    Description TEXT,
    Experience_Required INT,
    Location VARCHAR(255),
    Salary_Range VARCHAR(100),
    Posting_Date DATETIME DEFAULT CURRENT_TIMESTAMP
);

Job Required Skills Table:

CREATE TABLE Job_Required_Skills (
    Job_ID INT,
    Skill VARCHAR(100),
    PRIMARY KEY (Job_ID, Skill),
    FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);

et
Application Table:

CREATE TABLE Application (
    Application_ID INT AUTO_INCREMENT PRIMARY KEY,
    Candidate_ID INT,
    Job_ID INT,
    Application_Status ENUM('Applied', 'Shortlisted', 'Rejected', 'Hired') DEFAULT 'Applied',
    Application_Date DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (Candidate_ID) REFERENCES Candidate(Candidate_ID) ON DELETE CASCADE,
    FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);

Interview Table:

CREATE TABLE Interview (
    Interview_ID INT AUTO_INCREMENT PRIMARY KEY,
    Application_ID INT,
    Interview_Date DATETIME NOT NULL,
    Interviewer_Name VARCHAR(255) NOT NULL,
    Feedback TEXT,
    Outcome ENUM('Pass', 'Fail') DEFAULT NULL,
    FOREIGN KEY (Application_ID) REFERENCES Application(Application_ID) ON DELETE CASCADE
);

Recruiter Table:

CREATE TABLE Recruiter (
    Recruiter_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(255) NOT NULL,
    Email VARCHAR(255) UNIQUE NOT NULL,
    Phone VARCHAR(20) UNIQUE NOT NULL
);

Recruiter Assigned Jobs Table:

CREATE TABLE Recruiter_Assigned_Jobs (


Recruiter_ID INT,
Job_ID INT,
PRIMARY KEY (Recruiter_ID, Job_ID),
FOREIGN KEY (Recruiter_ID) REFERENCES Recruiter(Recruiter_ID) ON DELETE CASCADE,
FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);


AI Matching Table:
CREATE TABLE AI_Matching (
Match_ID INT AUTO_INCREMENT PRIMARY KEY,
Candidate_ID INT,
Job_ID INT,
Match_Score DECIMAL(5,2) CHECK (Match_Score >= 0 AND Match_Score <= 100),
FOREIGN KEY (Candidate_ID) REFERENCES Candidate(Candidate_ID) ON DELETE CASCADE,
FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);

16.1 Data Insertion


INSERT INTO Candidate (Name, Email, Phone, Address, Resume_Link, Experience_Years) VALUES
('John Doe', '[email protected]', '1234567890', '123 Main St', 'resume_john.pdf', 5),
('Jane Smith', '[email protected]', '0987654321', '456 Elm St', 'resume_jane.pdf', 3);

INSERT INTO Candidate_Skills (Candidate_ID, Skill) VALUES
(1, 'Python'),
(1, 'SQL'),
(2, 'Java'),
(2, 'HTML');

INSERT INTO Job (Title, Description, Experience_Required, Location, Salary_Range) VALUES
('Software Engineer', 'Develop and maintain software applications.', 3, 'New York', '$70,000-$90,000'),
('Web Developer', 'Build and maintain websites.', 2, 'San Francisco', '$60,000-$80,000');

INSERT INTO Job_Required_Skills (Job_ID, Skill) VALUES
(1, 'Python'),
(1, 'SQL'),
(2, 'HTML'),
(2, 'CSS');

INSERT INTO Application (Candidate_ID, Job_ID, Application_Status) VALUES
(1, 1, 'Applied'),
(2, 2, 'Shortlisted');

INSERT INTO Interview (Application_ID, Interview_Date, Interviewer_Name, Feedback, Outcome) VALUES
(1, '2025-02-01 10:00:00', 'Alice Johnson', 'Good technical skills.', 'Pass');

INSERT INTO Recruiter (Name, Email, Phone) VALUES
('Michael Scott', '[email protected]', '1112223333'),
('Dwight Schrute', '[email protected]', '4445556666');

INSERT INTO Recruiter_Assigned_Jobs (Recruiter_ID, Job_ID) VALUES
(1, 1),
(2, 2);

INSERT INTO AI_Matching (Candidate_ID, Job_ID, Match_Score) VALUES
(1, 1, 85.5),
(2, 2, 90.0);

17 Relational Algebra

Illustration:
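1. Only in A (not in J or C)
Query Statement:
Find all applications that do not have a matching job or candidate.
Relational Algebra:

\[
\begin{aligned}
A - (J \cup C) ={}& \pi_{\text{Application\_ID},\,\text{Candidate\_ID},\,\text{Job\_ID},\,\text{Application\_Status},\,\text{Application\_Date}}(\text{Application}) \\
&\setminus \bigl(\pi_{\text{Application\_ID}}(\text{Application} \bowtie \text{Candidate}) \cup \pi_{\text{Application\_ID}}(\text{Application} \bowtie \text{Job})\bigr)
\end{aligned}
\]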
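2. Only in J (not in A or C):
Query Statement:
Find all jobs that are not associated with any application or candidate.
Relational Algebra:

\[
\begin{aligned}
J - (A \cup C) ={}& \pi_{\text{Job\_ID},\,\text{Title},\,\text{Description},\,\text{Required\_Skills},\,\text{Experience\_Required},\,\text{Location},\,\text{Salary\_Range},\,\text{Posting\_Date}}(\text{Job}) \\
&\setminus \bigl(\pi_{\text{Job\_ID}}(\text{Job} \bowtie \text{Application}) \cup \pi_{\text{Job\_ID}}(\text{Job} \bowtie \text{Candidate})\bigr)
\end{aligned}
\]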
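
3. Only in C (not in A or J):
Query Statement:
Find all candidates who have not applied for any job and are not associated with any job directly.
Relational Algebra:

\[
\begin{aligned}
C - (A \cup J) ={}& \pi_{\text{Candidate\_ID},\,\text{Name},\,\text{Email},\,\text{Phone},\,\text{Address},\,\text{Resume\_Link},\,\text{Skills},\,\text{Experience\_Years},\,\text{Profile\_Creation\_Date}}(\text{Candidate}) \\
&\setminus \bigl(\pi_{\text{Candidate\_ID}}(\text{Candidate} \bowtie \text{Application}) \cup \pi_{\text{Candidate\_ID}}(\text{Candidate} \bowtie \text{Job})\bigr)
\end{aligned}
\]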

4. In A and J, but not in C:
Query Statement:
Find all applications that have a corresponding job but no corresponding candidate.
Relational Algebra:

\[
\begin{aligned}
(A \cap J) - C ={}& \bigl(\pi_{\text{Application\_ID},\,\text{Candidate\_ID},\,\text{Job\_ID},\,\text{Application\_Status},\,\text{Application\_Date}}(\text{Application} \bowtie \text{Job})\bigr) \\
&\setminus \pi_{\text{Application\_ID}}(\text{Application} \bowtie \text{Candidate})
\end{aligned}
\]

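5. In A and C, but not in J:
Query Statement:
Find all applications where a candidate exists but no corresponding job exists.
Relational Algebra:

\[
\begin{aligned}
(A \cap C) - J ={}& \bigl(\pi_{\text{Application\_ID},\,\text{Candidate\_ID},\,\text{Job\_ID},\,\text{Application\_Status},\,\text{Application\_Date}}(\text{Application} \bowtie \text{Candidate})\bigr) \\
&\setminus \pi_{\text{Application\_ID}}(\text{Application} \bowtie \text{Job})
\end{aligned}
\]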

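6. In J and C, but not in A:
Query Statement:
Find all jobs that have a candidate associated but no application exists for them.
Relational Algebra:

\[
\begin{aligned}
(J \cap C) - A ={}& \bigl(\pi_{\text{Job\_ID},\,\text{Candidate\_ID},\,\text{Title},\,\text{Description},\,\text{Required\_Skills},\,\text{Experience\_Required},\,\text{Location},\,\text{Salary\_Range},\,\text{Posting\_Date}}(\text{Job} \bowtie \text{Candidate})\bigr) \\
&\setminus \pi_{\text{Job\_ID}}(\text{Job} \bowtie \text{Application})
\end{aligned}
\]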

7. In A, J, and C (Common to all three):
Query Statement:
Find all applications where a candidate has applied for a job, meaning there is a connection between all three entities.
Relational Algebra:

\[
A \cap J \cap C = \pi_{\text{Application\_ID},\,\text{Candidate\_ID},\,\text{Job\_ID},\,\text{Application\_Status},\,\text{Application\_Date}}(\text{Application} \bowtie \text{Job} \bowtie \text{Candidate})
\]

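8. At least in one of A, J, or C (Entire Venn Diagram - Union):
Query Statement:
Find all records that exist in at least one of the Application, Job, or Candidate tables.
Relational Algebra:

\[
\begin{aligned}
A \cup J \cup C ={}& \pi_{\text{Application\_ID},\,\text{Candidate\_ID},\,\text{Job\_ID},\,\text{Application\_Status},\,\text{Application\_Date}}(\text{Application}) \\
&\cup \pi_{\text{Job\_ID},\,\text{Title},\,\text{Description},\,\text{Required\_Skills},\,\text{Experience\_Required},\,\text{Location},\,\text{Salary\_Range},\,\text{Posting\_Date}}(\text{Job}) \\
&\cup \pi_{\text{Candidate\_ID},\,\text{Name},\,\text{Email},\,\text{Phone},\,\text{Address},\,\text{Resume\_Link},\,\text{Skills},\,\text{Experience\_Years},\,\text{Profile\_Creation\_Date}}(\text{Candidate})
\end{aligned}
\]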

Table 1: Relational Algebra Operations

Operation                               Formal Relational Algebra Expression
--------------------------------------  ------------------------------------
Only in A (not in J or C)               A - (J ∪ C)
Only in J (not in A or C)               J - (A ∪ C)
Only in C (not in A or J)               C - (A ∪ J)
In A and J, but not in C                (A ∩ J) - C
In A and C, but not in J                (A ∩ C) - J
In J and C, but not in A                (J ∩ C) - A
In A, J, and C (common to all)          A ∩ J ∩ C
At least in one of A, J, or C (union)   A ∪ J ∪ C
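
To make the regions in Table 1 concrete, the sketch below evaluates most of them on a tiny hand-made data set using plain Python sets. It assumes the reading used throughout these notes, where an application "belongs to" J or C when its Job_ID or Candidate_ID resolves to an existing row. All ID values are hypothetical. The (J ∩ C) - A region is left out because it needs a Job-Candidate link outside Application.

```python
# Hypothetical key values; only the set logic matters here.
jobs = {301, 302, 303}          # J: existing Job_ID values
candidates = {201, 203, 204}    # C: existing Candidate_ID values

# A: (Application_ID, Candidate_ID, Job_ID)
applications = [
    (101, 998, 997),  # neither key resolves
    (102, 201, 999),  # candidate exists, job missing
    (103, 999, 301),  # job exists, candidate missing
    (105, 203, 303),  # both exist
]


def app_ids(has_candidate, has_job):
    """Application_IDs whose candidate/job existence matches the two flags."""
    return {
        app_id
        for app_id, cand_id, job_id in applications
        if (cand_id in candidates) == has_candidate and (job_id in jobs) == has_job
    }


only_a = app_ids(False, False)         # A - (J ∪ C)
a_and_j_not_c = app_ids(False, True)   # (A ∩ J) - C
a_and_c_not_j = app_ids(True, False)   # (A ∩ C) - J
a_and_j_and_c = app_ids(True, True)    # A ∩ J ∩ C

# Jobs and candidates that no application refers to.
applied_jobs = {job_id for _, _, job_id in applications}
applied_candidates = {cand_id for _, cand_id, _ in applications}
only_j = jobs - applied_jobs                # J - (A ∪ C)
only_c = candidates - applied_candidates    # C - (A ∪ J)

print(only_a, a_and_j_not_c, a_and_c_not_j, a_and_j_and_c, only_j, only_c)
```

Each application lands in exactly one of the four application regions, which is a quick sanity check when writing the corresponding SQL.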


Relational Algebra Queries:


This document contains MySQL queries representing relational algebra operations for the
Candidate (C), Application (A), and Job (J) tables.

Table of Contents

• Practical scenario
• Relational Algebra Queries and SQL Equivalents

1. Only in A (not in J or C):
Relational Algebra: A - (J ∪ C)
Description: Find all applications that do not have a matching job or candidate.

SELECT * FROM Application A
WHERE A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);

2. Only in J (not in A or C):
Relational Algebra: J - (A ∪ C)
Description: Find all jobs that are not associated with any application or candidate.

SELECT * FROM Job J
WHERE J.Job_ID NOT IN (SELECT Job_ID FROM Application);

3. Only in C (not in A or J):
Relational Algebra: C - (A ∪ J)
Description: Find all candidates who have not applied for any job and are not associated with any job directly.

SELECT * FROM Candidate C
WHERE C.Candidate_ID NOT IN (SELECT Candidate_ID FROM Application);

4. In A and J, but not in C:
Relational Algebra: (A ∩ J) - C
Description: Find all applications that have a corresponding job but no corresponding candidate.

SELECT *
FROM Application A
WHERE A.Job_ID IN (SELECT Job_ID FROM Job)
AND A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate);

5. In A and C, but not in J:


Relational Algebra: (A ∩ C) - J
Description: Find all applications where a candidate exists but no corresponding job exists.


SELECT *
FROM Application A
WHERE A.Candidate_ID IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);

6. In J and C, but not in A:


Relational Algebra: (J ∩ C) - A
Description: Find all jobs that have a candidate associated but no application exists for them.
The Job table holds no direct reference to Candidate, so the association has to come from a linking table other than Application (a job cannot both appear and not appear in Application). The queries below take the association from AI_Matching.

SELECT *
FROM Job J
WHERE J.Job_ID IN (SELECT Job_ID FROM AI_Matching)
AND J.Job_ID NOT IN (SELECT Job_ID FROM Application);

Or equivalently

SELECT *
FROM Job J
WHERE EXISTS (
    SELECT 1 FROM AI_Matching M
    WHERE M.Job_ID = J.Job_ID
)
AND NOT EXISTS (
    SELECT 1 FROM Application A
    WHERE A.Job_ID = J.Job_ID
);

7. In A, J, and C (Common to all three):
Relational Algebra: A ∩ J ∩ C
Description: Find all applications where a candidate has applied for a job, meaning there is a connection between all three entities.

SELECT A.*
FROM Application A
JOIN Job J ON A.Job_ID = J.Job_ID
JOIN Candidate C ON A.Candidate_ID = C.Candidate_ID;

Or equivalently:

SELECT A.*
FROM Application A, Job J, Candidate C
WHERE A.Job_ID = J.Job_ID
AND A.Candidate_ID = C.Candidate_ID;

8. At least in one of A, J, or C (Union):
Relational Algebra: A ∪ J ∪ C
Description: Find all records that exist in at least one of the Application, Job, or Candidate tables.

SELECT Candidate_ID, NULL AS Job_ID, NULL AS Application_ID, Name AS Entity_Name,
       'Candidate' AS Entity_Type
FROM Candidate
UNION
SELECT NULL, Job_ID, NULL, Title, 'Job'
FROM Job
UNION
SELECT Candidate_ID, Job_ID, Application_ID, Application_Status, 'Application'
FROM Application;

17.1 Practical scenario

In real-world terms, (J ∩ C) - A means:
Find all jobs where a candidate is linked, but no actual application has been submitted.
Breaking it Down
(J ∩ C) → Jobs with associated candidates
• This means there is some logical or inferred connection between jobs and candidates.
• However, this doesn’t necessarily mean an application was submitted.
• This could be from AI-based matching, recruiter assignments, or candidate-job recommendations.
- A → Remove jobs where applications exist
• If an actual application exists for a job, remove that job from the result.
• The remaining jobs are those where a candidate is associated but hasn’t applied.

Real-Life Example:
Scenario:
A company has a job posting for a Software Engineer. They use an AI system that predicts a
match between candidates and jobs based on skills and experience.

Possible Situations:

Job                Candidate Matched? (J ∩ C)   Application Submitted? (A)  Included in (J ∩ C) - A?
-----------------  ---------------------------  --------------------------  ------------------------
Software Engineer  Yes (AI-matched)             No                          Yes
Data Scientist     Yes (Recruiter Shortlisted)  Yes                         No
Product Manager    No                           No                          No
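
The three situations above can be reproduced with a short script. This is a sketch under two assumptions: the job-candidate link is stored in AI_Matching for both matched jobs (the notes also mention recruiter shortlisting as a possible source), and Python's sqlite3 module stands in for MySQL. The candidate IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE AI_Matching (
    Match_ID INTEGER PRIMARY KEY, Candidate_ID INTEGER, Job_ID INTEGER);
CREATE TABLE Application (
    Application_ID INTEGER PRIMARY KEY, Candidate_ID INTEGER, Job_ID INTEGER);

INSERT INTO Job VALUES
    (1, 'Software Engineer'), (2, 'Data Scientist'), (3, 'Product Manager');
-- Jobs 1 and 2 each have a linked candidate (hypothetical IDs 10 and 11).
INSERT INTO AI_Matching VALUES (1, 10, 1), (2, 11, 2);
-- Only the Data Scientist job received an actual application.
INSERT INTO Application VALUES (1, 11, 2);
""")

# (J ∩ C) - A: a match exists for the job, but no application does.
rows = conn.execute("""
    SELECT J.Title
    FROM Job J
    WHERE EXISTS (SELECT 1 FROM AI_Matching M WHERE M.Job_ID = J.Job_ID)
      AND NOT EXISTS (SELECT 1 FROM Application A WHERE A.Job_ID = J.Job_ID)
    ORDER BY J.Job_ID
""").fetchall()

matched_without_application = [title for (title,) in rows]
print(matched_without_application)
```

Only the Software Engineer job survives both conditions, matching the last column of the table.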


17.2 Relational Algebra Queries and SQL Equivalents

This document provides relational algebra expressions and their equivalent SQL queries for
different database operations involving Application (A), Job (J), and Candidate (C).
Each section includes:
• Relational Algebra Expression
• SQL Query

• Example Scenario
• Tabular Representation of Example Data

1. Only in A (not in J or C)
Relational Algebra:

A − (J ∪ C)

Description:
Find all applications that do not have a matching job or candidate.

SELECT * FROM Application A
WHERE A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);

Example Scenario:
A job application was submitted for a job that does not exist in the system or for a candidate who is not registered.

Application_ID  Candidate_ID  Job_ID  Status
--------------  ------------  ------  -------
101             NULL          NULL    Pending

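
One caveat applies when the key columns contain NULL, as in the example row: under SQL's three-valued logic, `NULL NOT IN (...)` evaluates to unknown rather than true, so a NOT IN filter may skip such a row while a NOT EXISTS filter keeps it. The sketch below demonstrates the difference using Python's sqlite3 module (assumed here as a stand-in for MySQL). Application 106 with dangling IDs 999 and 888 is a hypothetical extra row added for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Candidate (Candidate_ID INTEGER PRIMARY KEY);
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY);
CREATE TABLE Application (
    Application_ID INTEGER PRIMARY KEY,
    Candidate_ID INTEGER, Job_ID INTEGER, Status TEXT);

INSERT INTO Candidate VALUES (201);
INSERT INTO Job VALUES (301);
INSERT INTO Application VALUES (101, NULL, NULL, 'Pending');  -- keys are NULL
INSERT INTO Application VALUES (106, 999, 888, 'Pending');    -- keys dangle
""")

# NOT IN: comparing NULL against the subquery yields unknown, row 101 is dropped.
not_in_ids = [r[0] for r in conn.execute("""
    SELECT Application_ID FROM Application A
    WHERE A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate)
      AND A.Job_ID NOT IN (SELECT Job_ID FROM Job)
    ORDER BY Application_ID
""")]

# NOT EXISTS: the correlated subquery simply finds no row, so 101 is kept.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT Application_ID FROM Application A
    WHERE NOT EXISTS (SELECT 1 FROM Candidate C
                      WHERE C.Candidate_ID = A.Candidate_ID)
      AND NOT EXISTS (SELECT 1 FROM Job J
                      WHERE J.Job_ID = A.Job_ID)
    ORDER BY Application_ID
""")]

print(not_in_ids, not_exists_ids)
```

If orphaned rows are recorded with NULL keys, the NOT EXISTS form (or an explicit `IS NULL` test) is the safer way to find them.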

2. Only in J (not in A or C)
Relational Algebra:

J − (A ∪ C)

Description:
Find all jobs that are not associated with any application or candidate.

SELECT * FROM Job J
WHERE J.Job_ID NOT IN (SELECT Job_ID FROM Application);

Example Scenario:
A company has posted job listings, but no one has applied, and no candidate is associated.

Job_ID  Title             Applications?
------  ----------------  -------------
301     DevOps Engineer   No
302     Product Designer  No


3. Applications where a candidate exists but no corresponding job
Relational Algebra:

(A ∩ C) − J

Description:
Find all applications where a candidate exists but no corresponding job exists.

SELECT *
FROM Application A
WHERE A.Candidate_ID IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);

Example Scenario:
A candidate submitted an application but the job was deleted.

Application_ID  Candidate_ID  Job_ID  Status
--------------  ------------  ------  ---------
102             201           NULL    Submitted

4. Applications that have a corresponding job but no corresponding candidate
Relational Algebra:

(A ∩ J) − C

Description:
Find all applications that have a corresponding job but no corresponding candidate.

SELECT *
FROM Application A
WHERE A.Job_ID IN (SELECT Job_ID FROM Job)
AND A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate);

Example Scenario:
A job application was created but the candidate was deleted from the system.

Application_ID  Candidate_ID  Job_ID  Status
--------------  ------------  ------  ---------
103             NULL          301     Submitted

5. Jobs that have a candidate associated but no application exists
Relational Algebra:

(J ∩ C) − A

Description:
Find all jobs where a candidate is linked, but no actual application has been submitted.
The link cannot be read from Application itself (a job cannot both appear and not appear there), so the query below takes it from AI_Matching.

SELECT *
FROM Job J
WHERE J.Job_ID IN (SELECT Job_ID FROM AI_Matching)
AND J.Job_ID NOT IN (SELECT Job_ID FROM Application);

Example Scenario:
A candidate was recommended for a job by an AI system or recruiter but did not submit an application.

Job_ID  Title         Candidate Matched?
------  ------------  -----------------------
401     Data Analyst  Yes (AI-matched)
402     UX Designer   Yes (Recruiter Assigned)

6. Find all records existing in at least one of the three tables
Relational Algebra:

A ∪ J ∪ C

Description:
Find all records that exist in at least one of the Application, Job, or Candidate tables.

SELECT Candidate_ID, NULL AS Job_ID, NULL AS Application_ID, Name AS Entity_Name,
       'Candidate' AS Entity_Type
FROM Candidate
UNION
SELECT NULL, Job_ID, NULL, Title, 'Job'
FROM Job
UNION
SELECT Candidate_ID, Job_ID, Application_ID, Application_Status, 'Application'
FROM Application;

Example Scenario:
List all records that exist in any of the three tables.

Candidate_ID  Job_ID  Application_ID  Entity Type
------------  ------  --------------  -----------
201           NULL    NULL            Candidate
NULL          301     NULL            Job
202           302     104             Application

7. Find all applications where a candidate has applied for a job
Relational Algebra:

A ∩ J ∩ C

Description:
Find all applications where a candidate has applied for a job, meaning there is a connection between all three entities.

SELECT A.*
FROM Application A
JOIN Job J ON A.Job_ID = J.Job_ID
JOIN Candidate C ON A.Candidate_ID = C.Candidate_ID;

Example Scenario:
A valid application exists where both a job and a candidate are present.

Application_ID  Candidate_ID  Job_ID  Status
--------------  ------------  ------  --------
105             203           303     Approved

18 Cross Product and Join in RDBMS

In Relational Database Management Systems (RDBMS), cross product and joins define
how tables are combined. Below is a detailed explanation of each, with examples.
Cross Product (Cartesian Product):
Definition:
A cross product (Cartesian Product) is the combination of every row from the
first table with every row from the second table. It generates a result set that
has m × n rows, where:
• m = number of rows in the first table
• n = number of rows in the second table
SQL Syntax:

SELECT * FROM Table1, Table2;

-- OR using explicit CROSS JOIN
SELECT * FROM Table1 CROSS JOIN Table2;

Example:
Table: Employees

emp_id  emp_name
------  --------
1       Alice
2       Bob

Table: Departments

dept_id  dept_name
-------  ---------
10       HR
20       IT

Cross Product Result:

emp_id  emp_name  dept_id  dept_name
------  --------  -------  ---------
1       Alice     10       HR
1       Alice     20       IT
2       Bob       10       HR
2       Bob       20       IT


Note About Cross Product:


• Produces a very large result set if both tables have many rows.
• Not commonly used directly, usually filtered with a condition.
• Forms the basis for INNER JOIN or JOIN conditions.
Joins in RDBMS:
A join is used to combine records from two or more tables based on a related column.
Types of Joins:
1. INNER JOIN
2. OUTER JOIN
   • LEFT JOIN (LEFT OUTER JOIN)
   • RIGHT JOIN (RIGHT OUTER JOIN)
   • FULL JOIN (FULL OUTER JOIN)
3. SELF JOIN
4. CROSS JOIN (Same as Cross Product)
INNER JOIN:
An INNER JOIN returns only the matching rows from both tables where there is a common
value.
SQL Syntax:

SELECT * FROM Table1
INNER JOIN Table2
ON Table1.common_column = Table2.common_column;

Example:
Table: emp_dept

emp_id  dept_id
------  -------
1       10
2       20

INNER JOIN Query:

SELECT e.emp_id, e.emp_name, d.dept_id, d.dept_name
FROM Employees e
INNER JOIN emp_dept ed ON e.emp_id = ed.emp_id
INNER JOIN Departments d ON ed.dept_id = d.dept_id;

Result:

emp_id  emp_name  dept_id  dept_name
------  --------  -------  ---------
1       Alice     10       HR
2       Bob       20       IT
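
The relationship between the two operations can be checked directly: an inner join returns the same rows as a cross product followed by a filter on the join condition. The sketch below loads the three small tables from this chapter into an in-memory SQLite database (Python's sqlite3 module is assumed here in place of MySQL) and compares the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT);
CREATE TABLE Departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE emp_dept (emp_id INTEGER, dept_id INTEGER);

INSERT INTO Employees VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Departments VALUES (10, 'HR'), (20, 'IT');
INSERT INTO emp_dept VALUES (1, 10), (2, 20);
""")

# Cross product: every employee paired with every department (m x n rows).
cross_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM Employees e CROSS JOIN Departments d
    ORDER BY e.emp_id, d.dept_id
""").fetchall()

# Cross product of all three tables, then filtered in WHERE.
filtered_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM Employees e, emp_dept ed, Departments d
    WHERE e.emp_id = ed.emp_id AND ed.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# The same result written with explicit INNER JOINs.
inner_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM Employees e
    INNER JOIN emp_dept ed ON e.emp_id = ed.emp_id
    INNER JOIN Departments d ON ed.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(len(cross_rows), filtered_rows == inner_rows)
```

Here m = 2 and n = 2, so the cross product has four rows, while the join keeps only the two pairs recorded in emp_dept.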

OUTER JOINs:
LEFT JOIN (LEFT OUTER JOIN)

• Returns all rows from the left table and matching rows from the right table.
• If there is no match, NULL is returned for the right table's columns.
SQL Syntax:

SELECT e.emp_id, e.emp_name, d.dept_id, d.dept_name
FROM Employees e
LEFT JOIN emp_dept ed ON e.emp_id = ed.emp_id
LEFT JOIN Departments d ON ed.dept_id = d.dept_id;

Example Result (assuming Employees also contains a third row, 3 Charlie, with no entry in emp_dept):

emp_id  emp_name  dept_id  dept_name
------  --------  -------  ---------
1       Alice     10       HR
2       Bob       20       IT
3       Charlie   NULL     NULL

RIGHT JOIN (RIGHT OUTER JOIN)

• Returns all rows from the right table and matching rows from the left table.
• If there is no match, NULL is returned for the left table.
FULL JOIN (FULL OUTER JOIN)
• Returns all records from both tables, with NULL where there is no match.
SELF JOIN:
A SELF JOIN is when a table is joined with itself.
SQL Syntax (assuming Employees has a manager_id column that references emp_id):

SELECT e1.emp_name AS Employee, e2.emp_name AS Manager
FROM Employees e1
JOIN Employees e2 ON e1.manager_id = e2.emp_id;

CROSS JOIN (Same as Cross Product):
• Returns every combination of rows from both tables.
• Produces m × n results.
SQL Syntax:

SELECT * FROM Employees CROSS JOIN Departments;

Example Output:

emp_id  emp_name  dept_id  dept_name
------  --------  -------  ---------
1       Alice     10       HR
1       Alice     20       IT
2       Bob       10       HR
2       Bob       20       IT


Key Differences Between Cross Product and Join:

Feature      Cross Product (Cartesian Product)  JOIN (Inner, Outer, etc.)
-----------  ---------------------------------  ----------------------------------
Definition   Combines all rows from both tables Combines rows based on a condition
Output Size  m × n rows                         Smaller, only relevant matches
Use Case     Rarely used alone                  Used frequently in databases
Performance  Slow due to large result sets      Optimized with indexes


• Cross Product gives all possible combinations of rows.


• Joins filter meaningful relationships between tables.
• INNER JOIN keeps only matches, while OUTER JOINs retain unmatched rows.

19 JOIN Operations in AI Matching Recruitment System

This document explains the relevance of JOIN operations in the AI-Driven Recruitment
System, focusing on how they are used to manage candidate applications, AI matching, job
assignments, and recruiter interactions.
JOIN Types Overview:
• INNER JOIN: Returns records with matching values in both tables.
• LEFT JOIN (LEFT OUTER JOIN): Returns all records from the left table and matched records from the right table; NULL if no match exists.
• RIGHT JOIN (RIGHT OUTER JOIN): Returns all records from the right table and matched records from the left table; NULL if no match exists.
• FULL JOIN (FULL OUTER JOIN): Returns all records from both tables, matched where possible; NULL fills the side that has no match.
• SELF JOIN: Joins a table with itself to compare rows within the same table.
• CROSS JOIN: Returns the Cartesian product of both tables.


INNER JOIN:
Definition: Returns records that have matching values in both tables.
Relevance:
• To find candidates who have applied for jobs.
• To match candidates with jobs using AI Matching scores.
• To get interview details for specific applications.
Example 1: Find All Candidates with Their Job Applications

SELECT C.Name, J.Title, A.Application_Status
FROM Candidate C
INNER JOIN Application A ON C.Candidate_ID = A.Candidate_ID
INNER JOIN Job J ON A.Job_ID = J.Job_ID;

Example 2: Retrieve AI-Matched Candidates for a Specific Job


SELECT C.Name, J.Title, AM.Match_Score
FROM AI_Matching AM
INNER JOIN Candidate C ON AM.Candidate_ID = C.Candidate_ID
INNER JOIN Job J ON AM.Job_ID = J.Job_ID
WHERE J.Job_ID = 1;  -- Job_ID is an INT column in the schema

LEFT JOIN (LEFT OUTER JOIN):


Definition: Returns all records from the left table, and matched records from the right table.
Returns NULL if no match exists.

Relevance:
• To find candidates who haven’t applied for any jobs.
• To identify jobs with no assigned recruiters.
• To detect applications without any interviews scheduled.
Example 1: Find Candidates Who Have NOT Applied for Any Job

SELECT C.Name
FROM Candidate C
LEFT JOIN Application A ON C.Candidate_ID = A.Candidate_ID
WHERE A.Application_ID IS NULL;

Example 2: Find Jobs WITHOUT Assigned Recruiters

SELECT J.Title
FROM Job J
LEFT JOIN Job_Assignment JA ON J.Job_ID = JA.Job_ID
WHERE JA.Recruiter_ID IS NULL;

RIGHT JOIN (RIGHT OUTER JOIN):


Definition: Returns all records from the right table, and matched records from the left table.
Returns NULL if no match exists.

Relevance:
• To find recruiters who are not assigned to any jobs.
• To identify unmatched AI records (e.g., jobs without matched candidates).
Example 1: Find Recruiters WITHOUT Any Job Assignments

SELECT R.Name
FROM Job_Assignment JA
RIGHT JOIN Recruiter R ON JA.Recruiter_ID = R.Recruiter_ID
WHERE JA.Job_ID IS NULL;

FULL JOIN (FULL OUTER JOIN):



Definition: Returns all records from both tables, matched where possible. Records without matches will have NULL values.
Note: MySQL doesn’t support FULL JOIN directly. Use UNION with LEFT JOIN and RIGHT JOIN.
Relevance:
• To compare all candidates and applications, including those without any connections.
• To find jobs and recruiters, including those with or without job assignments.
Example 1: List All Candidates and Their Applications (Even If No Match Exists)


SELECT C.Name, A.Application_ID


FROM Candidate C
FULL JOIN Application A ON C.Candidate_ID = A.Candidate_ID;

Alternative for MySQL (Using UNION):

SELECT C.Name, A.Application_ID


FROM Candidate C
LEFT JOIN Application A ON C.Candidate_ID = A.Candidate_ID
UNION

SELECT C.Name, A.Application_ID
FROM Candidate C
RIGHT JOIN Application A ON C.Candidate_ID = A.Candidate_ID;
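
The UNION emulation can be verified on a few rows. The sketch below uses Python's sqlite3 module (an assumption, standing in for MySQL) and writes the second half as a LEFT JOIN with the tables swapped, which is equivalent to the RIGHT JOIN above and also runs on SQLite builds that lack RIGHT JOIN. The orphan application 11, pointing at a non-existent candidate 99, is hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Candidate (Candidate_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Application (Application_ID INTEGER PRIMARY KEY, Candidate_ID INTEGER);

INSERT INTO Candidate VALUES (1, 'John Doe'), (2, 'Jane Smith');
-- Application 10 belongs to John; application 11 has no matching candidate.
INSERT INTO Application VALUES (10, 1), (11, 99);
""")

rows = conn.execute("""
    SELECT C.Name, A.Application_ID
    FROM Candidate C
    LEFT JOIN Application A ON C.Candidate_ID = A.Candidate_ID
    UNION
    -- Swapped LEFT JOIN: keeps every Application row, like the RIGHT JOIN.
    SELECT C.Name, A.Application_ID
    FROM Application A
    LEFT JOIN Candidate C ON C.Candidate_ID = A.Candidate_ID
""").fetchall()

full_join_rows = set(rows)
print(full_join_rows)
```

Because UNION removes duplicate rows, the matched pair appears once. It would also merge genuinely identical rows, so selecting key columns alongside the names is advisable when duplicates are meaningful.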

SELF JOIN:
Definition: Joins a table with itself to compare rows within the same table.
Relevance:

• To find candidates with similar skill sets.
• To compare jobs with overlapping skills or salary ranges.
Example 1: Find Candidates with Matching Skills
SELECT A.Name AS Candidate1, B.Name AS Candidate2
FROM Candidate A
INNER JOIN Candidate B ON A.Skills = B.Skills
WHERE A.Candidate_ID <> B.Candidate_ID;
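
In the full MySQL schema, skills live in the Candidate_Skills table (one row per candidate and skill) rather than in a Skills column, so a shared-skill search becomes a self join on that table. The sketch below is one possible formulation, run with Python's sqlite3 module as a stand-in for MySQL. The third candidate and the skill rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Candidate (Candidate_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Candidate_Skills (
    Candidate_ID INTEGER, Skill TEXT,
    PRIMARY KEY (Candidate_ID, Skill));

INSERT INTO Candidate VALUES (1, 'John Doe'), (2, 'Jane Smith'), (3, 'Sam Lee');
INSERT INTO Candidate_Skills VALUES
    (1, 'Python'), (1, 'SQL'), (2, 'Java'), (2, 'SQL'), (3, 'HTML');
""")

# Self join on Candidate_Skills: S1 and S2 are two aliases of the same table.
# The "<" comparison lists each pair once and never pairs a row with itself.
shared_skill_pairs = conn.execute("""
    SELECT C1.Name, C2.Name, S1.Skill
    FROM Candidate_Skills S1
    JOIN Candidate_Skills S2
         ON S1.Skill = S2.Skill AND S1.Candidate_ID < S2.Candidate_ID
    JOIN Candidate C1 ON C1.Candidate_ID = S1.Candidate_ID
    JOIN Candidate C2 ON C2.Candidate_ID = S2.Candidate_ID
    ORDER BY C1.Name, C2.Name, S1.Skill
""").fetchall()

print(shared_skill_pairs)
```

Using `<` instead of `<>` avoids reporting both (John, Jane) and (Jane, John) for the same shared skill.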
CROSS JOIN:
Definition: Returns the Cartesian product of both tables.
Relevance:
• To generate all possible combinations of candidates and jobs (e.g., for AI Matching algorithms).
• To simulate scenarios where all candidates are matched with all jobs before applying filters.
Example 1: Generate All Candidate-Job Combinations (For AI Matching)
r.S

SELECT C.Name, J.Title


FROM Candidate C
CROSS JOIN Job J;

JOIN Type   Purpose in Recruitment System                                       Example Use Case
----------  ------------------------------------------------------------------  ----------------------------------------------------
INNER JOIN  Get matching records from both tables                               Candidates with job applications
LEFT JOIN   Get all records from the left table, even if no matches             Candidates who haven’t applied
RIGHT JOIN  Get all records from the right table, even if no matches            Recruiters without job assignments
FULL JOIN   Get all records from both tables, with NULLs where no match exists  All candidates and applications (with or without match)
SELF JOIN   Compare records within the same table                               Candidates with similar skills
CROSS JOIN  Cartesian product of both tables                                    All candidate-job combinations for AI Matching

• INNER JOIN and LEFT JOIN are the most commonly used in the recruitment system.
• CROSS JOIN is valuable when designing AI Matching models.
• SELF JOIN helps identify internal patterns (e.g., similar candidates or jobs).

20 AI Matching Recruitment System with Conventional DBMS Models

The AI Matching Recruitment System can be represented using various conventional
DBMS models. Each model describes how data is organized, related, and managed in the
recruitment workflow.
• Hierarchical Model
• Network Model
• Relational Model (RDBMS)
• Object-Oriented Model (OODBMS)
• Entity-Relationship Model (ER Model)
• Document-Oriented Model (NoSQL)
• Key-Value Model (NoSQL)
• Graph Model (NoSQL)


Hierarchical Model:
Structure:
Data is organized in a tree-like hierarchy, where each child has a single parent.

Representation:

Company
└── Recruiters
└── Jobs
└── Applications

└── Candidates
└── AI_Matching
└── Interviews

Key Points:
• Each Recruiter manages multiple Jobs.
• Applications are linked to Candidates.
• AI_Matching connects candidates to jobs.


• Interviews are linked under applications.


Advantages:
• Fast data retrieval for hierarchical structures.
• Simple for parent-child relationships.
Disadvantages:
• Rigid structure; difficult for many-to-many relationships.
Network Model:

Structure:
Organizes data as a graph allowing many-to-many relationships.

Representation:

(Candidate) <--- Applies To ---> (Job)
     |                             |
  Matches                      Assigned
     ↓                             ↓
(AI_Matching) <--- Managed By ---> (Recruiter)
     |
Interviewed In
     |
(Interview)

Advantages:
• Handles complex M:N relationships efficiently.
• Flexible data connections.
Disadvantages:
• Complex navigation paths.
• Hard to manage schema changes.

Relational Model (RDBMS):


Structure:
Data is organized in tables with relationships managed through primary and foreign keys.

Tables and Relationships:


• Candidate (Candidate_ID, Name, Email, Skills, Experience)
• Job (Job_ID, Title, Skills, Location, Salary)
• Application (Application_ID, Candidate_ID, Job_ID, Status)
D

• Recruiter (Recruiter_ID, Name, Email, Phone)


• Job_Assignment (Assignment_ID, Job_ID, Recruiter_ID, Date)
• AI_Matching (Match_ID, Candidate_ID, Job_ID, Match_Score)
• Interview (Interview_ID, Application_ID, Date, Feedback, Outcome)
Relationships:
• Candidate ↔ Application (1:N)
• Job ↔ Application (1:N)

89
NITK, Release DB COURSE PLAN-2024-25

• Job ↔ Job_Assignment ↔ Recruiter (M:N)


• Candidate ↔ AI_Matching ↔ Job (M:N)
• Application ↔ Interview (1:N)
Advantages:
• Strong data integrity with ACID compliance.
• Powerful SQL queries.
Disadvantages:

• Performance issues with large-scale complex joins.
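
The relational representation above can be tried out with Python's built-in sqlite3 module. The following is a minimal sketch, not the course's reference implementation: it keeps only three of the tables, trims their columns, and uses made-up sample rows (all of these are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Trimmed-down versions of Candidate, Job and Application (columns reduced for brevity)
conn.executescript("""
CREATE TABLE Candidate (Candidate_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Application (
    Application_ID INTEGER PRIMARY KEY,
    Candidate_ID INTEGER NOT NULL REFERENCES Candidate(Candidate_ID),
    Job_ID INTEGER NOT NULL REFERENCES Job(Job_ID),
    Status TEXT
);
INSERT INTO Candidate VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Job VALUES (1, 'Data Scientist');
INSERT INTO Application VALUES (1, 1, 1, 'Applied'), (2, 2, 1, 'Shortlisted');
""")

# The 1:N relationships are followed with joins on the foreign keys
rows = conn.execute("""
    SELECT c.Name, j.Title, a.Status
    FROM Application a
    JOIN Candidate c ON c.Candidate_ID = a.Candidate_ID
    JOIN Job j ON j.Job_ID = a.Job_ID
    ORDER BY a.Application_ID
""").fetchall()
print(rows)
```

With foreign keys switched on, inserting an Application row that points at a non-existent candidate is rejected, which is the integrity guarantee listed under the advantages.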
Object-Oriented Model (OODBMS):

Structure:
Represents data as objects, similar to OOP languages.
Example:

class Candidate:
    def __init__(self, candidate_id, name, skills, experience):
        self.candidate_id = candidate_id
        self.name = name
        self.skills = skills
        self.experience = experience

class Job:
    def __init__(self, job_id, title, skills, salary):
        self.job_id = job_id
        self.title = title
        self.skills = skills
        self.salary = salary

class Application:
    def __init__(self, app_id, candidate, job, status):
        self.app_id = app_id
        self.candidate = candidate
        self.job = job
        self.status = status

Advantages:
• Handles complex data structures naturally.
• Seamless integration with OOP languages.
Disadvantages:
• Less efficient for simple relational data.
• Complex query processing compared to RDBMS.
Entity-Relationship Model (ER Model)
Structure:
Data is represented as entities and relationships.
ER Diagram Representation:

[Candidate] --- Applies For ---> [Job]
     |                             |
  Matches                       Assigned
     ↓                             ↓
[AI_Matching] <--- Managed By ---> [Recruiter]

Interviewed In

[Interview]

Advantages:
• Excellent for conceptual database design.

• Clear visualization of data relationships.
Disadvantages:
• Not directly implemented; requires conversion to RDBMS.

Document-Oriented Model (NoSQL):
Structure:
Stores data as documents (JSON, BSON) for flexibility.
Example (MongoDB):

{
  "Candidate_ID": "C1",
  "Name": "Alice",
  "Skills": ["Python", "Machine Learning"],
  "Applications": [
    {
      "Job_ID": "J1",
      "Status": "Applied",
      "AI_Matching": {
        "Match_Score": 85
      },
      "Interviews": [
        {"Date": "2024-05-01", "Outcome": "Pass"}
      ]
    }
  ]
}

Advantages:
• High scalability and flexibility.
• Schema-less design supports dynamic data.
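Disadvantages:
• Complex querying compared to SQL.
• Transactional support across multiple documents is more limited than in an RDBMS
(MongoDB, for example, added multi-document transactions only in version 4.0).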
Key-Value Model (NoSQL):
Structure:
Stores data as key-value pairs, suitable for fast lookups.

Example (Redis):

Candidate:C1 → { "Name": "Alice", "Skills": "Python, ML" }
Job:J1 → { "Title": "Data Scientist", "Skills": "Python, SQL" }
Match:C1:J1 → { "Match_Score": 85 }

Advantages:
• Extremely fast for simple key-based queries.
• Scalable for real-time applications.

Disadvantages:
• Limited support for complex queries.
• No relational capabilities.
Graph Model (NoSQL):
Structure:
Represents data as nodes (entities) and edges (relationships).
Example (Neo4j):

(Alice)-[:APPLIED_FOR]->(Job:Data_Scientist)
(Alice)-[:MATCHED_WITH {score: 85}]->(Job:Data_Scientist)
(Recruiter:John)-[:MANAGES]->(Job:Data_Scientist)

Advantages:
• Efficient for complex relationship queries.
• Fast graph traversals (e.g., recommendations).
Disadvantages:
• Requires specialized query languages (e.g., Cypher).
• Overhead for simple data models.
Comparison of DBMS Models:

| Model                         | Best Use Case                         | Advantages                           | Disadvantages                       |
|-------------------------------|---------------------------------------|--------------------------------------|-------------------------------------|
| Hierarchical Model            | Simple parent-child workflows         | Fast retrieval for hierarchical data | Rigid structure                     |
| Network Model                 | Complex many-to-many relationships    | Efficient M:N handling               | Complex navigation                  |
| Relational Model (RDBMS)      | Structured data, SQL queries          | Strong data integrity (ACID)         | Performance issues with large joins |
| Object-Oriented Model         | Complex data, OOP integration         | Natural fit with OOP languages       | Complex queries                     |
| Entity-Relationship Model     | Conceptual database design            | Clear visualization                  | Requires conversion to RDBMS        |
| Document-Oriented Model (NoSQL) | Semi-structured data, flexible schema | High scalability                   | Complex querying                    |
| Key-Value Model (NoSQL)       | Real-time caching and lookups         | Blazing fast for key-based queries   | Limited relational support          |
| Graph Model (NoSQL)           | Social networks, recommendations      | Fast relationship queries            | Specialized query language required |

• Relational Models (RDBMS) are ideal for structured data and complex queries.
• Graph Models excel in relationship-heavy applications (e.g., AI Matching).

• NoSQL Models like Document-Oriented and Key-Value are great for dynamic, real-time data.
• The choice of the model depends on the specific needs of the recruitment system.


21

Server Hierarchy

Server
└── Database
    └── Schema
        ├── Tables
        │   ├── Columns
        │   ├── Rows
        │   └── Keys
        │       ├── Primary
        │       └── Foreign
        ├── Views
        ├── Indexes
        ├── Stored Procedures
        └── Triggers

22

Functional Dependency in RDBMS

What is a Functional Dependency?
A Functional Dependency (FD) is a relationship between two sets of attributes in a relation
(table) of a relational database.
Formally, if we say:

X → Y

This means: Attribute(s) X functionally determine attribute(s) Y.
In simple terms:
If two rows (tuples) have the same value for X, they must have the same value for Y.


22.1 Example

Consider a table Students:


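| StudentID | Name    | Department | Email             |
|-----------|---------|------------|-------------------|
| 101       | Alice   | CS         | [email protected] |
| 102       | Bob     | EE         | [email protected] |
| 103       | Charlie | CS         | [email protected] |

Assuming StudentID uniquely identifies a student, the following dependencies hold (sample
data can only be consistent with a dependency; it cannot prove one):
• StudentID → Name
• StudentID → Department
• StudentID → Email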
If two rows have the same StudentID, they must have the same Name, Department, and Email.
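
The row-level definition above translates directly into a check over a table instance. The sketch below is illustrative only: `fd_holds` is a hypothetical helper name, and the Email column is left out of the sample rows. Note that such a check can refute a dependency on given data, but a passing result does not prove the dependency holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True

students = [
    {"StudentID": 101, "Name": "Alice", "Department": "CS"},
    {"StudentID": 102, "Name": "Bob", "Department": "EE"},
    {"StudentID": 103, "Name": "Charlie", "Department": "CS"},
]

id_determines_name = fd_holds(students, ["StudentID"], ["Name"])
dept_determines_name = fd_holds(students, ["Department"], ["Name"])
print(id_determines_name, dept_determines_name)
```

Here Department → Name fails because the two CS rows carry different names, while StudentID → Name is consistent with the data.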
Types of Functional Dependencies
There are various types of Functional Dependencies:


1. Trivial Functional Dependency


If Y is a subset of X, then:

X → Y

is trivial.
Example:

StudentID, Name → Name

2. Non-Trivial Functional Dependency
If Y is not a subset of X.
Example:

StudentID → Name

3. Full Functional Dependency
Y is fully dependent on X and not on any proper subset of X.
Example:

StudentID → Email

4. Partial Dependency
Y depends on only part of a composite key X.
Example: If (CourseID, StudentID) is the primary key, and:

StudentID → StudentName

Then StudentName is partially dependent on the key.

5. Transitive Dependency
If:

A → B and B → C

Then:

A → C

Example:

StudentID → DepartmentID and DepartmentID → DepartmentName
⇒ StudentID → DepartmentName

Why Are Functional Dependencies Important?


Functional dependencies play a crucial role in:
1. Normalization
• Help in identifying redundant data
• Guide schema decomposition
• Eliminate anomalies (insertion, update, deletion)


2. Determining Keys
• Help define candidate keys, primary keys, and super keys
3. Data Integrity
• Ensure consistency and correctness of data
Armstrong’s Axioms
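Armstrong’s axioms are a sound and complete set of inference rules: every functional
dependency implied by a given set of FDs can be derived with them.

| Rule         | Description                    |
|--------------|--------------------------------|
| Reflexivity  | If Y ⊆ X, then X → Y           |
| Augmentation | If X → Y, then XZ → YZ         |
| Transitivity | If X → Y and Y → Z, then X → Z |

Extended rules: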
• Union: If X → Y and X → Z, then X → YZ
• Decomposition: If X → YZ, then X → Y and X → Z
• Pseudotransitivity: If X → Y and YZ → W, then XZ → W
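
In practice, the axioms are usually applied through the attribute closure X⁺, the set of all attributes determined by X. The following is a minimal sketch of the standard closure computation; the function name `closure` and the representation of FDs as (lhs, rhs) pairs of sets are assumptions made for this example.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already determined, so is the right side
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"StudentID"}, {"DepartmentID"}),
    ({"DepartmentID"}, {"DepartmentName"}),
]

student_closure = closure({"StudentID"}, fds)
print(sorted(student_closure))
```

X → Y follows from the given FDs exactly when Y is a subset of the closure of X, so the transitive dependency StudentID → DepartmentName shows up in the computed set.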
22.2 Practical Use Case

Imagine a university database:
If:

StudentID → Email
StudentID → Name

Then:
• Avoid storing Email or Name redundantly in other tables.
• Join on StudentID when necessary.

This improves consistency and reduces redundancy.


• A Functional Dependency (FD) expresses how one attribute set determines another.
• It is foundational for database normalization and schema design.
• Proper use of FD ensures efficient, consistent, and non-redundant data storage.

23

Functional Dependencies and Normal Forms

Functional dependencies play a key role in identifying redundancy and guiding table
decomposition during normalization in relational database design. Here’s how they relate to
normal forms: 1NF, 2NF, 3NF, and BCNF.
First Normal Form (1NF)
Rule:
• Every attribute must contain atomic (indivisible) values.
• No repeating groups or arrays.
Functional Dependency Role:
• Functional dependencies are not deeply involved yet, but 1NF is the necessary starting
point.
Example:

-- Violates 1NF:
CREATE TABLE Students (
    StudentID INT,
    Name VARCHAR(100),
    Courses VARCHAR(255) -- e.g., "Math, Physics"
);

Fix:

CREATE TABLE StudentCourses (
    StudentID INT,
    Name VARCHAR(100),
    Course VARCHAR(100)
);

Second Normal Form (2NF)
Rule:
• Must be in 1NF.
• No partial dependencies: every non-prime attribute must be fully functionally dependent
on the entire primary key.
Functional Dependency Role:
• Detect and eliminate partial functional dependencies.

Example:

-- Primary Key: (StudentID, CourseID)
CREATE TABLE Enrollments (
    StudentID INT,
    CourseID INT,
    StudentName VARCHAR(100), -- Partial Dependency
    CourseName VARCHAR(100) -- Partial Dependency
);

Functional dependencies:
• StudentID → StudentName
• CourseID → CourseName
Violates 2NF because StudentName depends only on StudentID and CourseName depends only on
CourseID, i.e., each depends on just part of the key.
Fix:

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    StudentName VARCHAR(100)
);

CREATE TABLE Courses (
    CourseID INT PRIMARY KEY,
    CourseName VARCHAR(100)
);

CREATE TABLE Enrollments (
    StudentID INT,
    CourseID INT,
    PRIMARY KEY (StudentID, CourseID)
);
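
Detecting the partial dependencies that 2NF removes can be sketched in a few lines. This is a simplified illustration, assuming the relation has a single candidate key and that the FDs are listed explicitly rather than derived; `partial_dependencies` is a hypothetical helper name.

```python
def partial_dependencies(key, fds):
    """List FDs whose left side is a proper subset of the key and
    whose right side contains non-key attributes."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        non_prime = set(rhs) - key
        if lhs < key and non_prime:  # '<' means proper subset for sets
            found.append((sorted(lhs), sorted(non_prime)))
    return found

# The Enrollments example: key (StudentID, CourseID)
enrollment_fds = [
    ({"StudentID"}, {"StudentName"}),
    ({"CourseID"}, {"CourseName"}),
]
violations = partial_dependencies({"StudentID", "CourseID"}, enrollment_fds)
print(violations)
```

Each reported pair suggests one table to split off, which matches the Students and Courses tables in the fix above.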

Third Normal Form (3NF)
Rule:
• Must be in 2NF.
• No transitive dependencies: non-key attributes should not depend on other non-key
attributes.
Functional Dependency Role:
• Detect and eliminate transitive dependencies.
Example:

CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,
    EmpName VARCHAR(100),
    DeptID INT,
    DeptName VARCHAR(100)
);

Functional dependencies:
• EmpID → DeptID
• DeptID → DeptName
• Therefore: EmpID → DeptName (transitive)

Fix:

CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,
    EmpName VARCHAR(100),
    DeptID INT
);

CREATE TABLE Departments (
    DeptID INT PRIMARY KEY,
    DeptName VARCHAR(100)
);

| Normal Form | FD Issue Eliminated   | Requirement                                                  |
|-------------|-----------------------|--------------------------------------------------------------|
| 1NF         | Non-atomic values     | Atomic values only                                           |
| 2NF         | Partial dependency    | Full dependency on the entire primary key                    |
| 3NF         | Transitive dependency | No non-key attributes depending on other non-key attributes |

24

Normalization in Relational Databases

Normalization is the process of organizing data in a relational database to:
• Reduce redundancy
• Prevent update, insertion, and deletion anomalies
• Ensure data integrity
It involves breaking down large tables into smaller, related tables and connecting them via
foreign keys.
Without normalization, databases often suffer from:
• Insertion anomalies – You cannot insert a value without supplying other unnecessary data.
• Update anomalies – Updating data in one place but forgetting to update it elsewhere.
• Deletion anomalies – Deleting a record may remove important data unintentionally.
Example:

-- Before Normalization:
| StudentID | Name | Course1 | Course2 |
|-----------|--------|---------|---------|
| 101 | Alice | Math | Physics |

Issues:
• Courses are stored in multiple columns (repeating group).
• Hard to scale, search, or maintain.
After Normalization (1NF):

| StudentID | Name   | Course  |
|-----------|--------|---------|
| 101       | Alice  | Math    |
| 101       | Alice  | Physics |

Each normal form (NF) builds upon the previous one to further reduce redundancy.
First Normal Form (1NF)
Goal: Eliminate repeating groups and ensure atomicity.
• All values in columns must be atomic (indivisible).
• No arrays, lists, or nested tables.
Bad:

| StudentID | Name   | Courses         |
|-----------|--------|-----------------|
| 101       | Alice  | Math, Physics   |

Good:

| StudentID | Name   | Course  |
|-----------|--------|---------|
| 101       | Alice  | Math    |
| 101       | Alice  | Physics |

Second Normal Form (2NF)
Goal: Eliminate partial dependencies.
• Must be in 1NF.
• Every non-prime attribute must be fully functionally dependent on the whole primary key.
Bad:

-- Primary key: (StudentID, CourseID)
| StudentID | CourseID | StudentName |

StudentName depends only on StudentID, not the full key.
Fix:
• Move StudentName to a separate Students table.
Third Normal Form (3NF)
Goal: Eliminate transitive dependencies.
• Must be in 2NF.
• No non-prime attribute should depend on another non-prime attribute.
Bad:

| EmpID | DeptID | DeptName |

Here:
• EmpID → DeptID
• DeptID → DeptName
• Therefore: EmpID → DeptName (transitive)
Fix:

CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,
    EmpName VARCHAR(100),
    DeptID INT
);

CREATE TABLE Departments (
    DeptID INT PRIMARY KEY,
    DeptName VARCHAR(100)
);

Boyce-Codd Normal Form (BCNF)
Goal: For every non-trivial functional dependency X → Y, X must be a superkey.
Functional dependency:
• InstructorName → CourseID violates BCNF since InstructorName is not a superkey.
Fix:

CREATE TABLE Instructors (
    InstructorName VARCHAR(100) PRIMARY KEY,
    CourseID INT
);

CREATE TABLE CourseRooms (
    CourseID INT,
    RoomNumber INT,
    PRIMARY KEY (CourseID, RoomNumber)
);

Benefits of normalization:
• Minimizes redundancy
• Improves data consistency
• Simplifies maintenance
• Enhances data integrity
• Saves storage space
When to consider denormalization:
• Denormalization may be preferred for performance (e.g., analytics, reporting).
• Useful when minimizing joins and optimizing read-heavy queries.

| Normal Form | FD Issue Eliminated   | Requirement                                                  |
|-------------|-----------------------|--------------------------------------------------------------|
| 1NF         | Non-atomic values     | Atomic values only                                           |
| 2NF         | Partial dependency    | Full dependency on the entire primary key                    |
| 3NF         | Transitive dependency | No non-key attributes depending on other non-key attributes |

25

SQL-Based Normalization Walkthrough

Scenario: Student Enrollment System
We begin with a denormalized table containing student and course data.
Unnormalized Table

CREATE TABLE StudentEnrollment (
    StudentID INT,
    StudentName VARCHAR(100),
    Course1 VARCHAR(100),
    Course2 VARCHAR(100),
    Instructor1 VARCHAR(100),
    Instructor2 VARCHAR(100)
);

Sample Data:

| StudentID | StudentName | Course1 | Course2 | Instructor1 | Instructor2 |
|-----------|-------------|---------|---------|-------------|-------------|
| 101       | Alice       | Math    | Physics | Dr. Smith   | Dr. Brown   |

Step 1: First Normal Form (1NF)
Goal: Eliminate repeating groups and ensure atomicity.
New Table:

CREATE TABLE StudentCourses (
    StudentID INT,
    StudentName VARCHAR(100),
    Course VARCHAR(100),
    Instructor VARCHAR(100)
);

Transformed Data:

| StudentID | StudentName | Course  | Instructor |
|-----------|-------------|---------|------------|
| 101       | Alice       | Math    | Dr. Smith  |
| 101       | Alice       | Physics | Dr. Brown  |

Step 2: Second Normal Form (2NF)
Goal: Eliminate partial dependencies.
• Composite key: (StudentID, Course)
• StudentName depends only on StudentID
Refactored Tables:

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    StudentName VARCHAR(100)
);

CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    Instructor VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID)
);

Step 3: Third Normal Form (3NF)
Goal: Eliminate transitive dependencies.
• Instructor depends on Course, not on (StudentID, Course)
Strictly speaking, because Course is part of the composite key, Course → Instructor is a
partial dependency and would already be removed in 2NF; it is handled here as a separate
step to keep the walkthrough incremental.
Refactored Tables:

CREATE TABLE Courses (
    Course VARCHAR(100) PRIMARY KEY,
    Instructor VARCHAR(100)
);

CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (Course) REFERENCES Courses(Course)
);

Final Normalized Schema

-- Students table
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    StudentName VARCHAR(100)
);

-- Courses table
CREATE TABLE Courses (
    Course VARCHAR(100) PRIMARY KEY,
    Instructor VARCHAR(100)
);

-- Enrollment table
CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (Course) REFERENCES Courses(Course)
);

-- Instructor info (optional BCNF step)
CREATE TABLE Instructors (
    Instructor VARCHAR(100) PRIMARY KEY,
    Course VARCHAR(100)
);

-- Course-Location assignment (optional BCNF step)
CREATE TABLE CourseAssignments (
    Course VARCHAR(100),
    Room VARCHAR(50),
    PRIMARY KEY (Course, Room),
    FOREIGN KEY (Course) REFERENCES Courses(Course)
);

26

SQL Inserts, Joins, and Updates Without Anomalies

Insert Sample Data
Students

INSERT INTO Students (StudentID, StudentName) VALUES
    (101, 'Alice'),
    (102, 'Bob'),
    (103, 'Charlie');

Courses

INSERT INTO Courses (Course, Instructor) VALUES
    ('Math', 'Dr. Smith'),
    ('Physics', 'Dr. Brown'),
    ('Chemistry', 'Dr. Green');

StudentCourses

INSERT INTO StudentCourses (StudentID, Course) VALUES
    (101, 'Math'),
    (101, 'Physics'),
    (102, 'Chemistry'),
    (103, 'Math');

Join Across Normalized Tables
Get student names, enrolled courses, and their instructors:

SELECT
    s.StudentID,
    s.StudentName,
    sc.Course,
    c.Instructor
FROM
    Students s
    JOIN StudentCourses sc ON s.StudentID = sc.StudentID
    JOIN Courses c ON sc.Course = c.Course;

Sample Output:

| StudentID | StudentName | Course    | Instructor  |
|-----------|-------------|-----------|-------------|
| 101       | Alice       | Math      | Dr. Smith   |
| 101       | Alice       | Physics   | Dr. Brown   |
| 102       | Bob         | Chemistry | Dr. Green   |
| 103       | Charlie     | Math      | Dr. Smith   |

Update Without Anomalies
Problem in unnormalized tables:
To update instructor info, we’d need to find and update every row that repeats it; missing
even one row leaves the data inconsistent.
Now:
Update once in the Courses table.

UPDATE Courses
SET Instructor = 'Dr. Albert Smith'
WHERE Course = 'Math';

Re-run the same join:

SELECT
    s.StudentID,
    s.StudentName,
    sc.Course,
    c.Instructor
FROM
    Students s
    JOIN StudentCourses sc ON s.StudentID = sc.StudentID
    JOIN Courses c ON sc.Course = c.Course;

Updated Output:

| StudentID | StudentName | Course | Instructor |
|-----------|-------------|-----------|------------------|
| 101 | Alice | Math | Dr. Albert Smith |
| 101 | Alice | Physics | Dr. Brown |
| 102 | Bob | Chemistry | Dr. Green |
| 103 | Charlie | Math | Dr. Albert Smith |
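
The insert, update, and join sequence above can be replayed end to end with Python's built-in sqlite3 module. This is a minimal sketch under the assumption that SQLite's type names (INTEGER, TEXT) are acceptable stand-ins for INT and VARCHAR; an ORDER BY is added so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
CREATE TABLE Courses (Course TEXT PRIMARY KEY, Instructor TEXT);
CREATE TABLE StudentCourses (
    StudentID INTEGER REFERENCES Students(StudentID),
    Course TEXT REFERENCES Courses(Course),
    PRIMARY KEY (StudentID, Course)
);
INSERT INTO Students VALUES (101, 'Alice'), (102, 'Bob'), (103, 'Charlie');
INSERT INTO Courses VALUES ('Math', 'Dr. Smith'), ('Physics', 'Dr. Brown'),
                           ('Chemistry', 'Dr. Green');
INSERT INTO StudentCourses VALUES (101, 'Math'), (101, 'Physics'),
                                  (102, 'Chemistry'), (103, 'Math');
""")

# The instructor name is stored once, so exactly one row changes
updated = conn.execute(
    "UPDATE Courses SET Instructor = 'Dr. Albert Smith' WHERE Course = 'Math'"
).rowcount

rows = conn.execute("""
    SELECT s.StudentName, sc.Course, c.Instructor
    FROM Students s
    JOIN StudentCourses sc ON s.StudentID = sc.StudentID
    JOIN Courses c ON sc.Course = c.Course
    ORDER BY s.StudentID, sc.Course
""").fetchall()
print(updated, rows)
```

A single updated row is enough for both Math enrollments to show the new instructor name, which is the point of storing the fact in one place.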

27

SQL DELETE and Cascading Relationships

Cascading Deletes with ON DELETE CASCADE
In a normalized database, if we delete a record (e.g., a student or a course), we often want
related data to be deleted automatically to avoid orphaned records.
For example, if a student is removed, all their enrollments should be removed too. Similarly,
deleting a course should remove any course assignments.
Step 1: Define Foreign Key with ON DELETE CASCADE
Here’s how to define cascading deletes when setting up the foreign key relationship:

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    StudentName VARCHAR(100)
);

CREATE TABLE Courses (
    Course VARCHAR(100) PRIMARY KEY,
    Instructor VARCHAR(100)
);

CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID) ON DELETE CASCADE,
    FOREIGN KEY (Course) REFERENCES Courses(Course) ON DELETE CASCADE
);

CREATE TABLE CourseAssignments (
    Course VARCHAR(100),
    Room VARCHAR(50),
    PRIMARY KEY (Course, Room),
    FOREIGN KEY (Course) REFERENCES Courses(Course) ON DELETE CASCADE
);

Step 2: Delete a Student and Automatically Remove Their Enrollments
Let’s delete a student, and observe how the cascading deletes automatically remove their
enrollments.

DELETE FROM Students
WHERE StudentID = 101;

Result:
The student with ID 101 (Alice) and all records associated with her in the StudentCourses
table will be automatically deleted.
Step 3: Delete a Course and Automatically Remove All Enrollments and Assignments
Similarly, let’s delete a course and have its assignments and student enrollments removed
automatically.

DELETE FROM Courses
WHERE Course = 'Math';

Result:
All records related to the course “Math” in both the StudentCourses and CourseAssignments
tables will be deleted.
Additional Notes on CASCADE
When you define ON DELETE CASCADE, be cautious as it can lead to unintentional data loss
if the delete operation is not carefully controlled. Always ensure that cascading deletes are
applied only to tables where it makes sense (e.g., child records related to a parent record).
You can also apply ON UPDATE CASCADE to propagate changes in primary key values across
related tables, but it’s less common than cascading deletes.
Testing Cascading Deletes
You can simulate cascading deletes by running the following tests:
1. Insert new data:

INSERT INTO Students (StudentID, StudentName) VALUES (104, 'David');
INSERT INTO Courses (Course, Instructor) VALUES ('Biology', 'Dr. Black');
INSERT INTO StudentCourses (StudentID, Course) VALUES (104, 'Biology');
INSERT INTO CourseAssignments (Course, Room) VALUES ('Biology', 'Room 303');

2. Delete student 104 and check:

DELETE FROM Students WHERE StudentID = 104;

3. Delete course Biology and check:

DELETE FROM Courses WHERE Course = 'Biology';

4. Check the tables to confirm:

SELECT * FROM Students;
SELECT * FROM StudentCourses;
SELECT * FROM CourseAssignments;

Cascading deletes ensure referential integrity by automatically cleaning up dependent
records when a parent record is deleted. Use them carefully to prevent accidental loss of
important data.
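
The test sequence above can be scripted with Python's built-in sqlite3 module. This is a sketch under two assumptions: SQLite type names replace INT/VARCHAR, and foreign key enforcement is switched on explicitly, since SQLite leaves it off by default on each new connection. The `count` helper is a name introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # without this, SQLite ignores ON DELETE CASCADE
conn.executescript("""
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
CREATE TABLE Courses (Course TEXT PRIMARY KEY, Instructor TEXT);
CREATE TABLE StudentCourses (
    StudentID INTEGER REFERENCES Students(StudentID) ON DELETE CASCADE,
    Course TEXT REFERENCES Courses(Course) ON DELETE CASCADE,
    PRIMARY KEY (StudentID, Course)
);
CREATE TABLE CourseAssignments (
    Course TEXT REFERENCES Courses(Course) ON DELETE CASCADE,
    Room TEXT,
    PRIMARY KEY (Course, Room)
);
INSERT INTO Students VALUES (104, 'David');
INSERT INTO Courses VALUES ('Biology', 'Dr. Black');
INSERT INTO StudentCourses VALUES (104, 'Biology');
INSERT INTO CourseAssignments VALUES ('Biology', 'Room 303');
""")

def count(table):
    """Number of rows in one of the fixed demo tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

conn.execute("DELETE FROM Students WHERE StudentID = 104")
enrollments_after_student = count("StudentCourses")   # enrollment removed by cascade
rooms_after_student = count("CourseAssignments")      # the course itself still exists

conn.execute("DELETE FROM Courses WHERE Course = 'Biology'")
rooms_after_course = count("CourseAssignments")       # room assignment removed by cascade
print(enrollments_after_student, rooms_after_student, rooms_after_course)
```

Deleting the student removes only the enrollment; the room assignment disappears only once the course itself is deleted.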

28

Lossless and Lossy Joins in RDBMS

Definition of Join
A join is an operation to combine two or more relations based on a common attribute.
Lossless Join
A lossless join ensures that decomposing and then rejoining relations reproduces exactly
the original relation: no tuples are lost and no spurious tuples are added.
• Maintains original data integrity
• No spurious tuples created
• The common attributes must form a key of at least one of the decomposed relations
Example:

-- Original
Student(id, name, dept)

-- Decomposed
Student1(id, name)
Student2(id, dept)

-- Rejoined
SELECT * FROM Student1
JOIN Student2 USING(id);

If id is a primary key, the join is lossless.
Lossy Join
D

A lossy join occurs when rejoining decomposed tables introduces incorrect or extra rows.
• Does not preserve original relation exactly
• May occur when common attribute is not a key
Example:

-- Original
R(A, B, C)

-- Decomposed
R1(A, B)
R2(B, C)

-- Rejoin
SELECT * FROM R1
JOIN R2 USING(B);

If B is a key of neither R1 nor R2 (i.e., neither B → A nor B → C holds), the join is lossy.
Lossless Join Test
A decomposition of R into R1 and R2 is lossless if:

(R1 ∩ R2) → R1 or (R1 ∩ R2) → R2

i.e., common attributes functionally determine one of the relations.
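
The lossy case is easy to see on concrete data. The sketch below uses a made-up two-row instance of R(A, B, C) in which B determines neither A nor C, projects it onto R1(A, B) and R2(B, C), and rejoins on B; `project` is a helper name introduced only for this illustration.

```python
def project(rows, attrs):
    """Relational projection: keep the listed attributes, drop duplicates."""
    return {tuple(row[a] for a in attrs) for row in rows}

# Hypothetical instance: both rows share B = 'x'
r = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 20},
]

r1 = project(r, ["A", "B"])
r2 = project(r, ["B", "C"])

# Natural join of R1 and R2 on the common attribute B
rejoined = {(a, b, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}
original = project(r, ["A", "B", "C"])

spurious = rejoined - original
print(sorted(spurious))
```

The rejoined relation contains every original tuple plus two that never existed, so the "loss" in a lossy join is a loss of information, not of rows.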

Table 1: Comparison Table

| Join Type | Description                                  |
|-----------|----------------------------------------------|
| Lossless  | No spurious tuples, maintains original data  |
| Lossy     | Extra tuples added, incorrect reconstruction |

29

Checking for Lossless vs Lossy Join in RDBMS

Lossless Join Detection Algorithm:
This algorithm checks whether a decomposition of a relation is lossless or lossy using a
tabular method and functional dependencies.
Inputs:
• Relation R(A1, A2, …, An)
• Decomposition: R1, R2, …, Rk
• Functional dependencies (FDs)
Algorithm Steps
1. Create a table:
• Rows = decomposed relations (R1, R2, …, Rk)
• Columns = attributes of R
2. Initialize values:
• If attribute Aj is in Ri → mark the cell in row i, column j as aⱼ
• Else → mark it as a unique symbol bᵢⱼ
3. Apply functional dependencies:
• For each FD X → Y: if two rows agree on all attributes of X, make them agree on Y as
well (when one of the two cells holds aⱼ, keep the aⱼ)
4. Repeat Step 3 until no further changes.
5. Check result:
• If any row contains only aⱼ values → the decomposition is lossless
• Else → lossy
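
The tabular method above can be sketched in Python as follows. This is an illustrative implementation, not the course's reference code: the function name `is_lossless`, the tuple encoding of the a and b symbols, and the representation of FDs as (lhs, rhs) pairs are all assumptions made for the example.

```python
def is_lossless(attributes, decomposition, fds):
    """Tabular (chase) test: True if the decomposition has a lossless join.

    attributes    -- list of attribute names of R
    decomposition -- list of attribute sets, one per sub-relation
    fds           -- list of (lhs, rhs) pairs of attribute collections
    """
    # ("a", A) is the distinguished symbol for column A;
    # ("b", i, A) is a symbol unique to row i and column A.
    table = [
        {A: ("a", A) if A in rel else ("b", i, A) for A in attributes}
        for i, rel in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for i in range(len(table)):
                for j in range(i + 1, len(table)):
                    if all(table[i][A] == table[j][A] for A in lhs):
                        for B in rhs:
                            x, y = table[i][B], table[j][B]
                            if x != y:
                                # ("a", ...) sorts before ("b", ...), so min prefers it
                                keep, drop = min(x, y), max(x, y)
                                for row in table:
                                    if row[B] == drop:
                                        row[B] = keep
                                changed = True
    return any(all(row[A] == ("a", A) for A in attributes) for row in table)

attrs = ["A", "B", "C"]
parts = [{"A", "B"}, {"B", "C"}]
lossy_case = is_lossless(attrs, parts, [(["A"], ["B"]), (["C"], ["A"])])
lossless_case = is_lossless(attrs, parts, [(["B"], ["C"])])
print(lossy_case, lossless_case)
```

With the FDs A → B and C → A no two rows ever agree on a left-hand side, so nothing is unified and the test reports a lossy decomposition; with B → C the first row becomes all-a and the decomposition is lossless.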
Example

Relation: R(A, B, C)
Decomposition: R1(A, B), R2(B, C)
FDs: A → B, C → A

Initial Table:
| Row | A   | B   | C   |
|-----|-----|-----|-----|
| R1  | a1  | a2  | b3  |
| R2  | b1  | a2  | a3  |

Apply FDs:

- A → B: the rows differ on A (a1 vs b1), so nothing changes
- C → A: the rows differ on C (b3 vs a3), so nothing changes
- No row is fully 'aᵢ' → Lossy

Remarks:
A decomposition is lossless if and only if we can propagate values such that at least one row
contains only original aᵢ values.
When decomposing a relation into two or more sub-relations, it’s important to ensure that
the original relation can be perfectly reconstructed using a natural join. This is known as a
lossless join. If reconstruction introduces spurious tuples, the decomposition is lossy.
Lossless Join Condition
For a decomposition of a relation R into R1 and R2, the join is lossless if:

(R1 ∩ R2) → R1 or (R1 ∩ R2) → R2

This means the common attribute(s) must functionally determine all attributes of at least
one of the relations.
Illustrated Example
Explanation of the Diagram
• Original Relation:
R(A, B, C)
• Decomposed Relations:
– R1(A, B)
– R2(B, C)
• Join operation: R1 ⋈ R2
Outcome:
• The join is lossless if B → A or B → C holds (i.e., B is a key in either relation).
• The join is lossy if B does not functionally determine the other attributes.
Use Case in AI-Driven Recruitment
Suppose you have a relation:

Candidate(candidate_id, name, job_id, job_title)

And decompose into:
• C1(candidate_id, name)
• C2(job_id, job_title)
C1 and C2 share no attribute, so their natural join degenerates into a Cartesian product
that pairs every candidate with every job. To keep the decomposition lossless, add a linking
relation:
• C3(candidate_id, job_id)
and make sure:
• candidate_id is a key in C1
• job_id is a key in C2
Otherwise, rejoining the relations may produce spurious combinations.

Table 1: Types of join

| Join Type         | Description                                                  |
|-------------------|--------------------------------------------------------------|
| Lossless Join     | Reconstructs original data without spurious tuples           |
| Lossy Join        | Results in incorrect (spurious) tuples                       |
| Lossless Criteria | Common attributes must functionally determine one relation  |

30

Stored Procedures in Relational Databases

A stored procedure is a named collection of one or more SQL statements that can be
executed as a single unit; depending on the database engine, it may be precompiled or have
its execution plan cached. These procedures are stored within the database itself and can be
invoked from an application or another SQL command. Stored procedures are typically used
to encapsulate logic that can be reused and to enhance performance by reducing the amount
of code sent over the network.
Benefits of Stored Procedures
1. Performance: Many database engines parse stored procedures ahead of time and cache
their execution plans, which reduces processing time on repeated execution. They also
reduce network overhead since only the procedure call is sent over the network
rather than multiple SQL queries.
2. Security: Stored procedures allow you to encapsulate complex logic and access control.
Permissions can be granted to users on the procedure rather than on the underlying
tables, providing an extra layer of security.
3. Maintainability: Once a stored procedure is created, it can be used multiple times in
different applications. This reduces redundancy and makes maintenance easier, as you
only need to update the logic in one place.
4. Transaction Control: Stored procedures can contain logic to handle transactions (e.g.,
using BEGIN TRANSACTION, COMMIT, and ROLLBACK). This ensures that all operations
within the procedure are completed successfully, or no changes are made if an
error occurs.
5. Reduced Complexity: Complex operations that require multiple steps can be encapsulated
in a stored procedure, allowing applications to call a single procedure rather than
multiple SQL statements.

Structure of a Stored Procedure
Stored procedures generally have the following structure:
1. Procedure Name: The name of the stored procedure.
2. Parameters: Input parameters (optional), which are used to pass values to the procedure.
3. Logic: A sequence of SQL statements and control flow constructs (such as loops,
conditions, etc.).
4. Return Type: Some stored procedures return values or result sets.

Basic Syntax (for MySQL, PostgreSQL, SQL Server)
MySQL (Example)

DELIMITER $$

CREATE PROCEDURE GetStudentInfo(IN student_id INT)
BEGIN
    SELECT name, age, course
    FROM students
    WHERE id = student_id;
END $$

DELIMITER ;

In this example:
• DELIMITER $$: Changes the delimiter so that the semicolon (;) can be used inside the
procedure body.

• IN student_id INT: Defines an input parameter called student_id.
• The body of the procedure contains a SELECT statement to retrieve student information.
PostgreSQL (Example)
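
CREATE OR REPLACE FUNCTION GetStudentInfo(student_id INT)
RETURNS TABLE(name VARCHAR, age INT, course VARCHAR) AS $$
BEGIN
    RETURN QUERY
    -- Columns are qualified with the alias s: the output columns declared in
    -- RETURNS TABLE share their names and would otherwise be ambiguous.
    SELECT s.name, s.age, s.course
    FROM students s
    WHERE s.id = student_id;
END;
$$ LANGUAGE plpgsql;

In PostgreSQL, functions are used similarly to stored procedures (PostgreSQL 11 and later
also provide CREATE PROCEDURE). Here, RETURN QUERY is used to return the result of a query.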
SQL Server (Example)

CREATE PROCEDURE GetStudentInfo
    @student_id INT
AS
BEGIN
    SELECT name, age, course
    FROM students
    WHERE id = @student_id;
END;

In SQL Server, the parameters are defined with a @ symbol, and BEGIN...END is used to define
the block of code.
Types of Stored Procedures:
1. Simple Stored Procedures: These procedures are used to perform simple operations
such as inserts, updates, and deletes.
2. Parameterized Stored Procedures: These allow for input parameters to be passed
to the stored procedure. These parameters can be used in the SQL logic inside the
procedure.


3. Returning Results: Some stored procedures return a result set (similar to a query) to
the calling application or user.
4. Transactional Stored Procedures: These procedures contain BEGIN TRANSACTION,
COMMIT, and ROLLBACK statements to manage transactions. They ensure atomicity and
integrity of the operations.
Example Use Cases:
1. Data Validation: A stored procedure can be used to validate incoming data before it is
inserted into a table. For example, checking if a customer’s age is above a certain value
or if an email address follows a valid format.
2. Complex Business Logic: A stored procedure can encapsulate complex business logic
that involves multiple operations, such as calculating discounts, applying taxes, and
updating inventory when an order is placed.
3. Batch Processing: You can use stored procedures to perform batch processing, such
as updating records in bulk or aggregating data from multiple tables.
4. Error Handling: Procedures can be used to handle errors systematically. For instance,
you can log errors into a separate error table whenever something goes wrong inside
the procedure.
5. Security Auditing: Procedures can be written to track who accessed sensitive data
or performed certain operations, which is especially useful in environments requiring
compliance with standards like HIPAA or GDPR.
Example of Complex Stored Procedure (with Error Handling)
DELIMITER $$

CREATE PROCEDURE ProcessOrder(IN p_order_id INT, IN p_customer_id INT)
BEGIN
    DECLARE total_amount DECIMAL(10,2);
    DECLARE error_message VARCHAR(255);

    -- Start Transaction
    START TRANSACTION;

    -- Check if the customer exists
    IF NOT EXISTS (SELECT 1 FROM customers WHERE id = p_customer_id) THEN
        SET error_message = 'Customer not found';
        ROLLBACK;
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = error_message;
    END IF;

    -- Retrieve total order value. The p_ prefix keeps the parameter from shadowing
    -- the order_id column; "order_id = order_id" would be true for every row.
    SELECT SUM(price) INTO total_amount FROM order_items WHERE order_id = p_order_id;

    -- If order amount exceeds limit, raise error
    IF total_amount > 1000 THEN
        SET error_message = 'Order exceeds allowed limit';
        ROLLBACK;
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = error_message;
    END IF;

    -- Update order status
    UPDATE orders SET status = 'Processed' WHERE id = p_order_id;

    -- Commit Transaction
    COMMIT;
END $$

DELIMITER ;

In this example:

• It starts a transaction using START TRANSACTION.
• It checks if the customer exists. If not, it rolls back the transaction and raises an error
using SIGNAL.

• It checks if the total order amount exceeds a threshold. If it does, it rolls back the
transaction and raises an error.
• If everything is successful, it updates the order status and commits the transaction.

Error Handling in Stored Procedures
Different databases offer various methods for handling errors within stored procedures:
• MySQL: You can use SIGNAL SQLSTATE to raise an error and DECLARE ... HANDLER to catch one.
• PostgreSQL: You can use EXCEPTION blocks to catch and handle errors.
• SQL Server: You can use TRY...CATCH blocks to handle errors.
Best Practices for Stored Procedures
1. Keep Logic Simple: The more complex your stored procedure becomes, the harder it
is to maintain. Try to keep procedures simple and focused on a single task.
2. Use Transaction Management: Ensure that your stored procedure uses proper transaction
handling (e.g., START TRANSACTION, COMMIT, ROLLBACK) to guarantee atomicity.
3. Parameterize Queries: Avoid concatenating user input directly into SQL queries to
prevent SQL injection attacks. Always use parameters.
4. Document Procedures: Write comments inside your stored procedures explaining the
purpose and logic to help future developers understand the code.
5. Avoid Nested Loops: If possible, avoid complex nested loops or queries inside your
stored procedures, as they can lead to poor performance.
6. Error Logging: Make use of error logging within the stored procedure for easier
debugging and to maintain records of any issues.
Stored procedures are a powerful feature in relational databases, offering benefits such as
performance optimization, security, and maintainability. They allow you to encapsulate complex
logic within the database, thus reducing the need for repetitive code in applications.
However, they should be used wisely, as they can complicate your database schema and
maintenance if not implemented properly.

31 MySQL Stored Procedure Examples

This section provides examples of common stored procedures in MySQL using different
parameter types and control structures.
1. Basic Stored Procedure (No Parameters)
DELIMITER //

CREATE PROCEDURE GetAllStudents()


BEGIN
SELECT * FROM students;
END //

DELIMITER ;

Call:

CALL GetAllStudents();

2. Stored Procedure with IN Parameters



DELIMITER //

CREATE PROCEDURE GetStudentById(IN stud_id INT)


BEGIN
SELECT * FROM students WHERE id = stud_id;
END //

DELIMITER ;

Call:

CALL GetStudentById(101);

3. Stored Procedure with OUT Parameters



DELIMITER //

CREATE PROCEDURE GetStudentName(IN stud_id INT, OUT stud_name VARCHAR(100))


BEGIN
SELECT name INTO stud_name FROM students WHERE id = stud_id;
END //

DELIMITER ;

Call:

CALL GetStudentName(101, @name);
SELECT @name;

4. Stored Procedure with INOUT Parameters
DELIMITER //

CREATE PROCEDURE UpdateAndReturnMarks(INOUT stud_marks INT)


BEGIN
SET stud_marks = stud_marks + 5;
END //

DELIMITER ;
Call:

SET @marks = 90;


CALL UpdateAndReturnMarks(@marks);
SELECT @marks; -- Returns 95

5. Stored Procedure with IF/ELSE



DELIMITER //

CREATE PROCEDURE CheckGrade(IN marks INT, OUT grade CHAR(1))


BEGIN

IF marks >= 90 THEN


SET grade = 'A';
ELSEIF marks >= 75 THEN
SET grade = 'B';
ELSE
SET grade = 'C';
END IF;

END //

DELIMITER ;

Call:

CALL CheckGrade(82, @grade);


SELECT @grade;

6. Stored Procedure with WHILE Loop


DELIMITER //

CREATE PROCEDURE PrintNumbers()


BEGIN
DECLARE i INT DEFAULT 1;

WHILE i <= 5 DO
SELECT i;
SET i = i + 1;

END WHILE;
END //

DELIMITER ;

Call:

CALL PrintNumbers();

7. Stored Procedure with CURSOR

DELIMITER //

CREATE PROCEDURE ListStudentNames()


BEGIN
DECLARE done INT DEFAULT 0;
DECLARE stud_name VARCHAR(100);
DECLARE cur CURSOR FOR SELECT name FROM students;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

OPEN cur;

read_loop: LOOP

FETCH cur INTO stud_name;


IF done THEN
LEAVE read_loop;
END IF;
SELECT stud_name;
END LOOP;

CLOSE cur;
END //

DELIMITER ;

Call:

CALL ListStudentNames();

32 MySQL Triggers

Triggers are stored programs in MySQL that are automatically invoked in response to certain
events on a table.
What is a Trigger?
A trigger in MySQL is a stored database object that automatically executes (or fires) in
response to a specific event such as:
• INSERT
• UPDATE
• DELETE
It’s like telling MySQL:
“Whenever this action happens on this table, automatically run this block of code.”
Why Use Triggers?

Triggers help in:


• Enforcing business rules
• Auditing data changes
• Maintaining referential integrity

• Performing automatic calculations


• Synchronizing tables
Trigger Events and Timing
Each trigger is associated with:

• An event: INSERT, UPDATE, or DELETE


• A timing: BEFORE or AFTER


Table 1: Possible combinations


Timing Event Example Use Case
BEFORE INSERT Validate or modify data before insert
AFTER INSERT Log new data into audit table
BEFORE UPDATE Enforce constraints
AFTER UPDATE Track history of changes
BEFORE DELETE Prevent deletion or archive data
AFTER DELETE Log deleted records

Syntax


CREATE TRIGGER trigger_name


{BEFORE | AFTER} {INSERT | UPDATE | DELETE}
ON table_name

FOR EACH ROW
BEGIN
-- SQL statements
END;

Notes
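• Use FOR EACH ROW to run the trigger body once per affected row (MySQL supports only row-level triggers).
• NEW.column_name is available in INSERT and UPDATE triggers.
• OLD.column_name is available in UPDATE and DELETE triggers.
• BEFORE triggers can modify the incoming row with SET NEW.column_name = value.
• AFTER triggers cannot change NEW values; they can still write to other tables.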
Examples

1. AFTER INSERT: Log inserted records

CREATE TABLE audit_log (


id INT AUTO_INCREMENT PRIMARY KEY,
student_id INT,
action_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP

);

DELIMITER //

CREATE TRIGGER after_student_insert


AFTER INSERT ON students
FOR EACH ROW

BEGIN
INSERT INTO audit_log(student_id) VALUES (NEW.id);
END //

DELIMITER ;

2. BEFORE UPDATE: Prevent negative marks


DELIMITER //

CREATE TRIGGER before_marks_update


BEFORE UPDATE ON students
FOR EACH ROW
BEGIN
IF NEW.marks < 0 THEN
SET NEW.marks = 0;
END IF;
END //

DELIMITER ;

3. BEFORE DELETE: Archive record

CREATE TABLE students_archive AS SELECT * FROM students WHERE 0;

DELIMITER //

CREATE TRIGGER before_student_delete
BEFORE DELETE ON students
FOR EACH ROW
BEGIN
INSERT INTO students_archive SELECT * FROM students WHERE id = OLD.id;
END //

DELIMITER ;
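
The AFTER INSERT and BEFORE DELETE examples above can be tried without a MySQL server. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in, and SQLite's trigger dialect differs from MySQL's (no DELIMITER, and no SET NEW.column, so the BEFORE UPDATE example is not reproduced). The sample rows are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); table names follow the notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER);
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id INTEGER,
    action_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Empty copy of students, as in the MySQL example
CREATE TABLE students_archive AS SELECT * FROM students WHERE 0;

CREATE TRIGGER after_student_insert
AFTER INSERT ON students
FOR EACH ROW
BEGIN
    INSERT INTO audit_log(student_id) VALUES (NEW.id);
END;

CREATE TRIGGER before_student_delete
BEFORE DELETE ON students
FOR EACH ROW
BEGIN
    INSERT INTO students_archive SELECT * FROM students WHERE id = OLD.id;
END;
""")

# Hypothetical sample rows: each INSERT fires the audit trigger,
# the DELETE fires the archive trigger before the row disappears.
conn.execute("INSERT INTO students VALUES (101, 'Asha', 88)")
conn.execute("INSERT INTO students VALUES (102, 'Ravi', 72)")
conn.execute("DELETE FROM students WHERE id = 102")

logged = [r[0] for r in conn.execute("SELECT student_id FROM audit_log ORDER BY id")]
archived = conn.execute("SELECT id, name, marks FROM students_archive").fetchall()
remaining = [r[0] for r in conn.execute("SELECT id FROM students")]
print(logged, archived, remaining)
```

Neither trigger is called explicitly; both run as a side effect of the INSERT and DELETE statements, which is the behaviour the notes describe.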

Trigger Management
List triggers:

SHOW TRIGGERS;

Drop trigger:

DROP TRIGGER IF EXISTS trigger_name;

Limitations
• Cannot trigger on TRUNCATE.

• Cannot commit/rollback inside a trigger.


• Cannot directly call a trigger.
• Cannot use triggers on views.

Table 2: Summary of Use-Cases


Use Case Trigger Type
Log changes AFTER INSERT/UPDATE
Audit user activity AFTER UPDATE/DELETE
Enforce data constraints BEFORE INSERT/UPDATE
Maintain backups BEFORE DELETE
Cascade updates BEFORE UPDATE

33 Stored Procedures and Triggers in AI-Driven Recruitment

Overview
In an AI-powered recruitment system, stored procedures and triggers help ensure that complex
workflows and automated monitoring actions are executed reliably and efficiently.
Use Case
When a candidate applies for a job, the system needs to:
1. Save the candidate’s basic information
2. Upload their resume reference
3. Insert the AI evaluation score
4. Update the job posting with the increased applicant count
To simplify and encapsulate these steps, a stored procedure called submit_application() is
used.
Additionally, to monitor AI evaluations, an AFTER INSERT trigger on the ai_scores table is
defined to automatically log entries into an audit table.
Diagram: Stored Procedures and Triggers Flow

Stored Procedure: submit_application()



DELIMITER //

CREATE PROCEDURE submit_application (
    IN p_name VARCHAR(100),
    IN p_email VARCHAR(100),
    IN p_resume_path VARCHAR(255),
    IN p_score INT,
    IN p_job_id INT
)
BEGIN
    DECLARE v_candidate_id INT;

    -- Undo every step and re-raise the error if any statement fails
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;

    START TRANSACTION;

    INSERT INTO candidates(name, email, job_id)
    VALUES (p_name, p_email, p_job_id);

    -- Capture the new candidate id immediately: a later insert into a table
    -- with its own AUTO_INCREMENT column changes what LAST_INSERT_ID() returns
    SET v_candidate_id = LAST_INSERT_ID();

    INSERT INTO resumes(candidate_id, file_path)
    VALUES (v_candidate_id, p_resume_path);

    INSERT INTO ai_scores(candidate_id, score)
    VALUES (v_candidate_id, p_score);

    UPDATE jobs
    SET applicant_count = applicant_count + 1
    WHERE id = p_job_id;

    COMMIT;
END //

DELIMITER ;

Trigger: Logging AI Scores

DELIMITER //

CREATE TRIGGER log_ai_score


AFTER INSERT ON ai_scores
FOR EACH ROW
BEGIN
INSERT INTO score_audit_log(candidate_id, score, logged_at)
VALUES (NEW.candidate_id, NEW.score, NOW());
END //

DELIMITER ;
Table 1: Benefits
Mechanism Purpose
Stored Procedure Groups related operations into a single transaction
AFTER INSERT Trigger Automatically logs score insertions

This architecture:

• Reduces redundancy
• Ensures consistency
• Improves auditability
34 Aggregate Functions in SQL

What are Aggregate Functions?
Aggregate functions perform a calculation on a set of values and return a single summary
value. They are used often with GROUP BY.

Table 1: Common Aggregate Functions
Function Description
COUNT() Count rows
SUM() Add values
AVG() Average of values
MIN() Minimum value
MAX() Maximum value
GROUP_CONCAT() Concatenate values (MySQL; PostgreSQL and SQL Server use STRING_AGG)

Examples
COUNT()

SELECT COUNT(*) FROM students;


SELECT COUNT(email) FROM students;

SUM()

SELECT department, SUM(marks) AS total_marks


FROM students
GROUP BY department;

AVG()

SELECT AVG(marks) FROM students;

MIN() and MAX()

SELECT MIN(marks), MAX(marks) FROM students;

GROUP_CONCAT()


SELECT GROUP_CONCAT(name) FROM students;

Using GROUP BY and HAVING

SELECT department, AVG(marks) AS avg_marks


FROM students
GROUP BY department
HAVING AVG(marks) > 80;

Note

• NULLs are ignored by most aggregate functions (except COUNT(*)).
• Use HAVING to filter groups (not WHERE).
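
The two notes above are easy to check on a small table. The following is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL; the four student rows are hypothetical and include one NULL email and one NULL mark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, email TEXT, department TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [
        ("Asha", "asha@example.com", "IT", 90),
        ("Ravi", None, "IT", 80),               # NULL email
        ("Meena", "meena@example.com", "CS", 70),
        ("Kiran", "kiran@example.com", "CS", None),  # NULL marks
    ],
)

# COUNT(*) counts rows; COUNT(email) skips rows where email is NULL.
total_rows, emails = conn.execute("SELECT COUNT(*), COUNT(email) FROM students").fetchone()

# AVG ignores the NULL mark, so it averages three values, not four.
avg_marks = conn.execute("SELECT AVG(marks) FROM students").fetchone()[0]

# HAVING filters whole groups after aggregation.
high_depts = conn.execute(
    "SELECT department, AVG(marks) FROM students "
    "GROUP BY department HAVING AVG(marks) > 80"
).fetchall()

print(total_rows, emails, avg_marks, high_depts)
```

A WHERE clause could not express the last filter, because the per-department average does not exist until the rows have been grouped.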

35 Aggregate Functions in AI-Driven Recruitment

Aggregate functions in SQL are used in AI-driven recruitment platforms to analyze and summarize
large volumes of candidate data generated by automated evaluation systems.
In particular, the ai_scores table stores scores for each candidate based on different skill
evaluations. These scores are analyzed using standard SQL aggregate functions.
Use Case: Analyzing AI Score Data

Table 1: ai_scores
candidate_id skill score
1001 Python 85
1001 SQL 90
1002 Python 78

Common aggregate functions used:


• AVG(score): average score per candidate or skill
• MAX(score), MIN(score): range of performance
• COUNT(*): number of total evaluations

• SUM(score): cumulative score for ranking or tuning AI


Diagram: Aggregate Function Flow
Table 2: Function Description
Function Purpose
AVG(score) Used in HR dashboards to rank candidate averages
MAX(score) Identifies top skill scorers
MIN(score) Flags weak skill areas
COUNT(*) Tracks total number of evaluations

SUM(score) Tuning metric for AI models

Consumers of Aggregated Data


• HR Managers: Use average and top scores to shortlist candidates.

• Admin Reports: Use COUNT(*) to monitor system usage and trends.


• AI Model Tuners: Use cumulative scores to adjust evaluation thresholds.
SQL Example

SELECT candidate_id,

AVG(score) AS avg_score,
MAX(score) AS max_score,
MIN(score) AS min_score
FROM ai_scores
GROUP BY candidate_id;
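
As a rough check, the query can be run against the three sample rows of the ai_scores table shown earlier in this chapter. This sketch assumes Python's built-in sqlite3 module instead of MySQL, and adds an ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ai_scores (candidate_id INTEGER, skill TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO ai_scores VALUES (?, ?, ?)",
    [(1001, "Python", 85), (1001, "SQL", 90), (1002, "Python", 78)],
)

summary = conn.execute(
    "SELECT candidate_id, AVG(score) AS avg_score, "
    "MAX(score) AS max_score, MIN(score) AS min_score "
    "FROM ai_scores GROUP BY candidate_id ORDER BY candidate_id"
).fetchall()

for row in summary:
    print(row)
```

Candidate 1001 has two evaluations and candidate 1002 has one, so each result row summarizes a different number of input rows.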

36 Indexes in RDBMS

What is an Index?
An index is a data structure that improves the speed of data retrieval operations on a database
table at the cost of additional storage and maintenance.
Why Use Indexes?
• Speed up SELECT queries
• Optimize WHERE, JOIN, ORDER BY, and GROUP BY
• Enable fast lookups and range searches
How Indexes Work
• Most indexes use B-Trees
• Index stores a sorted copy of key columns
• Maintains a pointer to actual row in the table

Syntax

CREATE INDEX idx_email ON students(email);


DROP INDEX idx_email ON students;

Table 1: Types of Indexes


Type Description
Single-column Index on one column
Composite Index on multiple columns
Unique Prevents duplicates

Full-text For searching large text


Spatial For GIS/spatial data
Clustered Sorts rows physically (SQL Server)
Non-clustered Separate index structure

Example

CREATE INDEX idx_email ON students(email);


SELECT * FROM students WHERE email = 'student@example.com';


Pros and Cons


Pros:
• Speeds up data retrieval
• Optimizes joins and filters
Cons:
• Slows down inserts/updates
• Consumes storage

• Requires tuning
Viewing Indexes

SHOW INDEX FROM students; -- MySQL

SELECT * FROM pg_indexes WHERE tablename = 'students'; -- PostgreSQL
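
To see whether a query actually uses an index, most systems can print their query plan. The sketch below is an assumption-based illustration using Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN (MySQL's counterpart is EXPLAIN); the 1000 generated rows and the helper name plan are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO students (name, email) VALUES (?, ?)",
    [(f"student{i}", f"student{i}@example.com") for i in range(1000)],
)

query = "SELECT * FROM students WHERE email = 'student42@example.com'"

def plan(connection, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql)
    return " ".join(row[-1] for row in rows)

before = plan(conn, query)   # no index on email yet: full table scan
conn.execute("CREATE INDEX idx_email ON students(email)")
after = plan(conn, query)    # the plan now mentions idx_email

print("before:", before)
print("after: ", after)
```

The exact wording of the plan text varies between SQLite versions, so the sketch only relies on the index name appearing after the index is created.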

37 B-Trees in RDBMS

What is a B-Tree?
A B-Tree is a self-balancing search tree that maintains sorted data and allows searches,
insertions, and deletions in logarithmic time.
Properties of B-Trees
• Max m children per node
• At least ceil(m/2) children (except root)
• All leaves at the same level
• Keys in sorted order
Why Use in RDBMS?
• Reduces disk I/O
• Supports fast range queries

• Maintains balance (logarithmic height)


Operations
Search:

• Start at root
• Binary search within node
• Traverse down recursively
Insert:

• Add key to correct position


• Split if node overflows
Delete:
• Remove key
• Borrow/merge to fix underflow


Table 1: B-Trees vs B+ Trees


Feature B-Tree B+ Tree
Data location Internal & leaf Leaf only
Leaf linked? No Yes
Used in RDBMS Sometimes Yes

MySQL Example

CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(100),

marks INT,
INDEX idx_marks (marks)
);

Benefits
• Balanced structure
• Good for large datasets
• Efficient disk usage
Limitations
• Complex to implement
• Slower than hash for point lookups
B-Trees in RDBMS (Order 3):
What is a B-Tree of Order 3?
A B-Tree of order 3 is a balanced search tree where:
• Each node can have at most 2 keys and 3 child pointers.

• Internal nodes (except the root) must have at least 2 children.


• All leaves appear at the same depth.
• Keys inside each node are stored in sorted order.

Why Use B-Trees for Indexing in RDBMS?


Relational databases such as MySQL (InnoDB), PostgreSQL, and Oracle use B-Trees (or
B+ Trees) for indexing because:
• B-Trees reduce disk I/O by minimizing the number of reads.
• They support log(n) lookup, insert, and delete operations.

• They handle large datasets efficiently.


• Data is stored in a balanced, sorted structure — perfect for indexes.
Structure of a B-Tree (Order 3)
• Each node stores 1 or 2 keys.
• Each node can have between 2 and 3 children.
• Keys are sorted within each node.


• All leaves are at the same level (balanced tree).


Example: Indexing a Table
Suppose we have a table:

CREATE TABLE students (


id INT PRIMARY KEY,
name VARCHAR(100)
);

Constructing a B-Tree of Order 3
Input:

Insert the following keys in order:

10, 20, 50, 40, 30, 70, 80, 60, 25, 55

Rules for B-Tree of Order 3:

• Max 2 keys per node
• Max 3 children per node
• Min 1 key per internal node (except the root)
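
These rules can be turned into a short program that inserts the input keys and prints the resulting tree level by level. It is a teaching sketch, not how a database engine implements its indexes: the names Node, insert and levels are hypothetical, and it uses the same "insert into the leaf, then split on overflow and promote the middle key" approach as the construction in this section.

```python
from bisect import bisect_left, insort

MAX_KEYS = 2  # order 3: at most 2 keys and 3 children per node


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list for a leaf

    def is_leaf(self):
        return not self.children


def _insert(node, key):
    """Insert key below node. Return (middle_key, right_node) if node split."""
    if node.is_leaf():
        insort(node.keys, key)  # keep keys sorted inside the node
    else:
        i = bisect_left(node.keys, key)  # index of the child to descend into
        split = _insert(node.children[i], key)
        if split:
            mid, right = split
            node.keys.insert(i, mid)            # promoted key joins this node
            node.children.insert(i + 1, right)  # new sibling sits to its right
    if len(node.keys) > MAX_KEYS:  # overflow: split around the middle key
        m = len(node.keys) // 2
        mid = node.keys[m]
        right = Node(node.keys[m + 1:], node.children[m + 1:])
        node.keys = node.keys[:m]
        node.children = node.children[:m + 1]
        return mid, right
    return None


def insert(root, key):
    split = _insert(root, key)
    if split:  # the root itself split: the tree grows one level
        mid, right = split
        return Node([mid], [root, right])
    return root


def levels(root):
    """Return the keys of every node, grouped by tree level."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


root = Node()
for k in [10, 20, 50, 40, 30, 70, 80, 60, 25, 55]:
    root = insert(root, k)

for depth, nodes in enumerate(levels(root)):
    print(depth, nodes)
```

The tree only grows taller when the root splits, which is why all leaves stay at the same depth.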
Step-by-step Construction:
1. Insert 10

[10]

2. Insert 20

[10, 20]

3. Insert 50

[10, 20, 50] → overflow

Middle key = 20 → promote

New structure:

     [20]
    /    \
 [10]    [50]

4. Insert 40

40 > 20 → go to [50] → insert → [40, 50]

     [20]
    /    \
 [10]    [40, 50]

5. Insert 30

30 > 20 → go to [40, 50] → insert → [30, 40, 50] → overflow

Promote middle = 40 → new root becomes [20, 40]

      [20, 40]
     /    |    \
 [10]   [30]   [50]

6. Insert 70

70 > 40 → go to [50] → insert → [50, 70]

      [20, 40]
     /    |    \
 [10]   [30]   [50, 70]

7. Insert 80

[50, 70] → insert → [50, 70, 80] → overflow

Promote 70 → [20, 40, 70] → root overflows

Promote middle = 40 → new root becomes [40]

            [40]
          /      \
      [20]        [70]
     /    \      /    \
 [10]    [30] [50]    [80]

8. Insert 60

60 > 40, < 70 → go to [50] → insert → [50, 60]

            [40]
          /      \
      [20]        [70]
     /    \      /    \
 [10]    [30] [50, 60] [80]

9. Insert 25

25 < 40 → go left to [20] → then to [30] → insert → [25, 30]

            [40]
          /      \
      [20]          [70]
     /    \        /    \
 [10] [25, 30] [50, 60] [80]

10. Insert 55

55 → goes to [50, 60] → insert → [50, 55, 60] → overflow

Promote 55 → [70] becomes [55, 70]

Children: [50], [60], [80]

Final B-Tree:

            [40]
          /      \
      [20]        [55, 70]
     /    \       /   |   \
 [10] [25, 30] [50] [60] [80]

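The CREATE TABLE students statement above creates an index on the id column automatically (in MySQL/InnoDB the PRIMARY KEY is the clustered index); the construction illustrates how such an index grows as keys are inserted.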

et
Table 2: Advantages of B-Tree Indexes in RDBMS (Order 3)
Feature Benefit
Balanced tree Predictable, fast lookup time (log n)
High fan-out (3 children per node here; hundreds in real indexes) Fewer levels → less disk I/O
Sorted keys Efficient for range queries (<, BETWEEN)

When Not to Use B-Trees


• Columns that are frequently updated (index maintenance is costly)
• Columns with low selectivity (e.g., gender: ‘M’/’F’)
• Situations where hash indexes or full-text search is better
• B-Trees of order 3 are a convenient teaching model; production indexes use much larger nodes (typically one disk page each).
• RDBMSs use B-Trees (or B+ Trees) for primary and secondary indexes.
• They provide efficient, scalable, and predictable performance.

38 B+ Trees in RDBMS

What is a B+ Tree?
A B+ Tree is a balanced tree data structure used to store indexes in databases. It is an
optimized version of the B-Tree where:
• All data is stored in leaf nodes.
• Internal nodes store only keys for navigation.
• Leaf nodes are linked together for fast range queries.

Table 1: Key Properties


Feature B+ Tree
Data storage In leaf nodes only
Internal nodes Store keys only
Leaf nodes linked Yes
ur

Range queries Very efficient


Used in MySQL (InnoDB) Default
39 Constructing a B+ Tree of Order 3

Input Sequence:

10, 20, 50, 40, 30, 60, 80, 70

B+ Tree Rules (Order = 3)


• Max 2 keys per node
• Internal nodes contain only routing keys
• Leaf nodes contain actual data and are linked sequentially
• On leaf overflow:
  – Split into left ⌊m/2⌋ keys and right ⌈m/2⌉ keys
  – Copy the first key of the right node up to the parent
• On internal-node overflow:
  – Move the middle key up to the parent; it stays in neither half
Step-by-Step Insertion
Insert 10, 20

[10, 20] ← leaf

Insert 50

[10, 20, 50] → overflow

Split into: [10] | [20, 50]
Promote: 20 to new root

     [20]
    /    \
 [10]    [20, 50]

Leaf Links: [10] → [20, 50]

Insert 40

Insert into [20, 50] → [20, 40, 50] → overflow

Split: [20] | [40, 50], promote 40

      [20, 40]
     /    |    \
 [10]   [20]   [40, 50]

Leaf Links: [10] → [20] → [40, 50]

Insert 30

Insert into [20] → becomes [20, 30]

      [20, 40]
     /    |    \
 [10] [20, 30] [40, 50]

Leaf Links: [10] → [20, 30] → [40, 50]

Insert 60

Insert into [40, 50] → becomes [40, 50, 60] → overflow

Split: [40] | [50, 60], promote 50

Parent [20, 40] becomes [20, 40, 50] → overflow

Split [20, 40, 50]: left = [20], right = [50], promote 40

            [40]
          /      \
      [20]        [50]
     /    \      /    \
 [10] [20,30] [40] [50,60]

Leaf Links: [10] → [20,30] → [40] → [50,60]

Insert 80

Insert into [50, 60] → becomes [50, 60, 80] → overflow

Split: [50] | [60, 80], promote 60

Parent [50] becomes [50, 60] → OK

            [40]
          /      \
      [20]        [50, 60]
     /    \       /   |   \
 [10] [20,30] [40] [50] [60,80]

Leaf Links: [10] → [20,30] → [40] → [50] → [60,80]

Insert 70

Insert into [60, 80] → becomes [60, 70, 80] → overflow

Split: [60] | [70, 80], promote 70

Parent [50, 60] → becomes [50, 60, 70] → overflow

Split: [50] | [70], promote 60 to root

              [40, 60]
           /      |      \
       [20]     [50]      [70]
      /    \    /   \     /   \
  [10] [20,30] [40] [50] [60] [70,80]

Final Leaf Links:

[10] → [20,30] → [40] → [50] → [60] → [70,80]
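
The same insertion sequence can be replayed in a short program. This is a teaching sketch under the rules listed in this chapter, not how InnoDB is implemented: the names Node, insert and leaf_chain are hypothetical, a leaf split copies the first key of the right half up to the parent, and an internal split moves the middle key up.

```python
from bisect import bisect_right, insort

MAX_KEYS = 2  # order 3: at most 2 keys per node


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None marks a leaf
        self.next = None          # right-sibling link, used by leaves only


def _insert(node, key):
    """Insert key under node. Return (separator, new_right_node) if node split."""
    if node.children is None:  # leaf
        insort(node.keys, key)
        if len(node.keys) <= MAX_KEYS:
            return None
        cut = len(node.keys) // 2            # left keeps floor(m/2) keys
        right = Node(node.keys[cut:])
        node.keys = node.keys[:cut]
        right.next, node.next = node.next, right  # splice into the leaf chain
        return right.keys[0], right          # copy-up: key also stays in the leaf
    i = bisect_right(node.keys, key)         # keys equal to a separator go right
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) <= MAX_KEYS:
        return None
    mid = len(node.keys) // 2                # internal split: middle key moves up
    sep = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return sep, right


def insert(root, key):
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    return Node([sep], [root, right])        # root split: tree grows one level


def leaf_chain(root):
    """Walk to the leftmost leaf, then follow the sibling links."""
    node = root
    while node.children is not None:
        node = node.children[0]
    chain = []
    while node is not None:
        chain.append(node.keys)
        node = node.next
    return chain


root = Node([])
for k in [10, 20, 50, 40, 30, 60, 80, 70]:
    root = insert(root, k)

print("root:", root.keys)
print("internal:", [c.keys for c in root.children])
print("leaves:", leaf_chain(root))
```

A range query such as "all keys from 20 to 60" only needs one descent to the leaf holding 20 and then a walk along the leaf chain, which is the reason the leaves are linked.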
Use in MySQL

CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(100),
marks INT,
INDEX idx_marks (marks)
);

• id: clustered B+ Tree


• idx_marks: secondary B+ Tree
Operations
Search:
• Navigate internal nodes
• Binary search in leaf node
Insert:
• Insert into correct leaf
• Split and propagate up if needed
Delete:
• Remove from leaf
• Rebalance if needed

Table 1: B+ Tree vs B-Tree


Feature B-Tree B+ Tree
Data Location All nodes Leaf nodes only
Internal Nodes Keys + data Keys only
r.S

Leaf Linked No Yes


Range Queries Moderate Fast
Used in MySQL Sometimes Default

Benefits
D

• Fast range queries


• Fewer disk reads due to high fan-out
• Used by default in RDBMS index structures
Limitations
• More complex than binary/B-tree
• More disk usage due to duplicated keys

40 Views in RDBMS

What is a View?
A view is a virtual table that is defined by an SQL query. It does not store data itself but
provides a way to represent data from one or more base tables.
• Views behave like tables in SELECT queries.
• The data in a view is fetched dynamically from the underlying tables.

Table 1: Why Use Views?


Feature Benefit
Simplification Encapsulate complex SQL queries
Security Expose only selected columns/rows
Reusability Reuse query logic across applications or reports
Logical Independence Hide changes in schema from front-end users
Maintenance Reduce redundancy by centralizing query logic
ur

View Syntax
CREATE VIEW
r.S

CREATE VIEW view_name AS


SELECT column1, column2
FROM table_name
WHERE condition;

SELECT FROM VIEW


D

SELECT * FROM view_name;

DROP VIEW
DROP VIEW view_name;

Example
CREATE TABLE students (
id INT,


name VARCHAR(100),
department VARCHAR(50),
marks INT
);

CREATE VIEW high_scorers AS


SELECT name, department, marks
FROM students
WHERE marks > 80;

SELECT * FROM high_scorers;
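
Because a view stores no rows of its own, it reflects changes to the base table immediately. The following sketch illustrates this with the high_scorers view; it assumes Python's built-in sqlite3 module in place of MySQL, and the two student rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, department TEXT, marks INTEGER)")
conn.execute(
    "CREATE VIEW high_scorers AS "
    "SELECT name, department, marks FROM students WHERE marks > 80"
)

conn.execute("INSERT INTO students VALUES (1, 'Asha', 'IT', 91)")
conn.execute("INSERT INTO students VALUES (2, 'Ravi', 'CS', 64)")

# Only Asha qualifies at this point.
first = [r[0] for r in conn.execute("SELECT name FROM high_scorers ORDER BY name")]

# Changing the base table changes what the view returns; the view is not refreshed by hand.
conn.execute("UPDATE students SET marks = 85 WHERE id = 2")
second = [r[0] for r in conn.execute("SELECT name FROM high_scorers ORDER BY name")]

print(first, second)
```

The view definition was created before any rows existed, which shows that it is the query that is stored, not its result.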

Table 2: Types of Views
Type Description
Simple View Based on a single table, no aggregates

Complex View Uses joins, aggregates, subqueries
Updatable View Allows INSERT/UPDATE/DELETE (if criteria met)
Read-only View Cannot be updated (e.g., uses GROUP BY)
Materialized View Physically stores data (not supported in MySQL)

Updatable vs Non-updatable Views


• A view is updatable if:
– Based on one table
– No GROUP BY, DISTINCT, JOIN, or UNION
– No aggregate functions

-- Non-updatable example
CREATE VIEW dept_avg AS
SELECT department, AVG(marks)
ur

FROM students
GROUP BY department;

Views for Security


r.S

You can create restricted views to hide sensitive data:

CREATE VIEW student_public AS


SELECT name, department
FROM students;

GRANT SELECT ON student_public TO user_readonly;


D

Advantages of Views
• Encapsulate query complexity
• Support data security and access control
• Provide abstraction layer between schema and users
• Enable code reuse
Limitations of Views


• Views do not store data (except materialized views)


• May have performance cost for large joins or complex logic
• Some views are not updatable

41 Views in AI-Driven Recruitment

What is a View?
A view in an AI-driven recruitment system is a virtual table that presents selected and simplified
data drawn from one or more underlying base tables. It allows stakeholders like HR
managers, administrators, or AI dashboards to access relevant information without directly
interacting with complex joins or sensitive data.
Use Case
In a recruitment platform, data may be spread across multiple tables:
• candidates: stores personal and job application details
• ai_scores: stores evaluation results generated by the AI engine
• jobs: contains job listings and metadata
To streamline access, a view named candidate_summary is created.
ur

CREATE VIEW candidate_summary AS


SELECT c.name, c.email, j.title AS job_title, AVG(s.score) AS avg_score
FROM candidates c
JOIN ai_scores s ON c.id = s.candidate_id
JOIN jobs j ON c.job_id = j.id
GROUP BY c.id, c.name, c.email, j.title;
r.S

This view allows users to: - See candidates’ names, emails, applied job titles - View average
AI scores for shortlisting
Diagram: View Integration
D

Diagram Description:

• Base Tables:
  – candidates(id, name, email, job_id)
  – ai_scores(candidate_id, skill, score)
  – jobs(id, title, status)
• View:
  – candidate_summary(name, email, job_title, avg_score)
• Consumers:
  – Admin Dashboard
  – HR Manager Panel
  – AI Evaluation Monitor

Table 1: Advantages of Views


Advantage Description

Simplification Hides complexity of joins and aggregations
Security Restricts direct access to sensitive base tables
Reusability Reused across dashboards and reporting modules

Logical Abstraction Interface remains stable even if schema changes

Security Use
You can expose only the view to certain users:

GRANT SELECT ON candidate_summary TO hr_user;

This way, HR sees evaluated summaries without needing access to raw AI scores or candidate
profiles.

42 Security and Backup in RDBMS

Security in RDBMS
Definition:
Security in RDBMS involves protecting the confidentiality, integrity, and availability of
database data through authentication, authorization, and access control.
Key Mechanisms:

Table 1: Key Mechanisms


Feature Description
Authentication Ensures users are who they claim to be
Authorization Controls what users can access
Roles & Privileges Fine-grained access control
Views Restrict access to selected columns/rows
Encryption Secures data in storage and in transit
ur

Auditing Logs user actions and data changes


SQL Injection Prevention Validates user input and prevents code injection

Example:
r.S

CREATE USER 'readonly_user'@'localhost' IDENTIFIED BY 'securepass';


GRANT SELECT ON mydb.students TO 'readonly_user'@'localhost';

Best Practices:
• Use least privilege principle
D

• Enable access auditing


• Encrypt sensitive data
• Use secure authentication mechanisms
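
The "SQL Injection Prevention" mechanism from the table above comes down to passing user input as a bound parameter instead of pasting it into the SQL text. The sketch below contrasts the two, assuming Python's built-in sqlite3 module rather than a MySQL client; the table contents and the malicious input string are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO students (name) VALUES (?)", [("Asha",), ("Ravi",)])

# Input crafted to turn the WHERE clause into a condition that is always true.
user_input = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text itself.
unsafe_sql = "SELECT name FROM students WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the ? placeholder sends the input as data, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM students WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every row in the table, while the parameterized query looks for a student literally named with the whole input string and finds none.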
Backup in RDBMS
Definition:
Backup refers to creating a copy of the database for disaster recovery and data restoration
purposes.


Types of Backup:

Type Description
Full Backup Copy of entire database
Incremental Only changes since the last backup
Differential Changes since last full backup
Logical Export of SQL commands (e.g., mysqldump, pg_dump)
Physical File-level copy of database files

Examples:

# MySQL Full Backup
mysqldump -u root -p mydb > backup.sql

# PostgreSQL Backup
pg_dump mydb > backup.pgsql

Restore:

mysql -u root -p mydb < backup.sql


psql mydb < backup.pgsql

Best Practices:
• Schedule automated backups (e.g., with cron)
• Test backup restoration regularly
• Store encrypted copies in secure, off-site locations
• Monitor backup processes for errors

Table 2: Summary
ur

Category Summary
Security Protect data using roles, views, encryption, and auditing
Backup Ensure data safety via regular full/incremental backups
43 Transactions and Related Topics in RDBMS

What is a Transaction?
A transaction is a group of SQL operations that execute as a single unit. It either fully
completes (COMMIT) or fully aborts (ROLLBACK).

Table 1: ACID Properties
Property Description
Atomicity All operations succeed or none at all
Consistency Database moves from one valid state to another
Isolation Concurrent transactions do not interfere with each other
Durability Changes persist even after system failure

SQL Transaction Commands



START TRANSACTION;
-- SQL operations
COMMIT;

-- or to undo
ROLLBACK;
r.S

Example:

START TRANSACTION;
UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;
COMMIT;
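
To illustrate atomicity, the transfer can be wrapped so that a failure in either UPDATE undoes both. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module in place of MySQL, the CHECK constraint, the starting balances and the helper name transfer are hypothetical, and the credit is deliberately executed first so that a failed debit leaves a partial change to roll back.

```python
import sqlite3

# isolation_level=None disables the module's implicit transactions,
# so BEGIN / COMMIT / ROLLBACK are issued explicitly, as in the SQL above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 300), (2, 100)")


def transfer(connection, src, dst, amount):
    """Move amount from src to dst; return True on commit, False on rollback."""
    connection.execute("BEGIN")
    try:
        connection.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        connection.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        connection.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        connection.execute("ROLLBACK")  # also undoes the credit made above
        return False


def balances(connection):
    return connection.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()


too_much = transfer(conn, 1, 2, 500)  # account 1 only holds 300
after_failed = balances(conn)
ok = transfer(conn, 1, 2, 200)
after_ok = balances(conn)

print(too_much, after_failed, ok, after_ok)
```

After the failed transfer both balances are unchanged, even though the credit to account 2 had already been executed inside the transaction.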
D

Isolation Levels


Table 2: Isolation Levels and Read Phenomena


Level Dirty Read Non-Repeatable Read Phantom Read
READ UNCOMMITTED Yes Yes Yes
READ COMMITTED No Yes Yes
REPEATABLE READ No No Yes
SERIALIZABLE No No No

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

Table 3: Concurrency Issues

Issue Description
Dirty Read Reading uncommitted data from another transaction
Non-Repeatable Read Re-running query gives different results
Phantom Read New rows appear in repeated queries

Savepoints
Savepoints are markers within a transaction to allow partial rollback.

START TRANSACTION;
SAVEPOINT sp1;
UPDATE employees SET salary = salary + 1000 WHERE id = 1;
ROLLBACK TO sp1;
COMMIT;
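
A savepoint only protects work done before it: ROLLBACK TO undoes what happened after the savepoint and keeps everything earlier in the transaction. The sketch below shows both halves, assuming Python's built-in sqlite3 module instead of MySQL; the second employee row and the extra UPDATE before the savepoint are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transaction control
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES (1, 50000), (2, 40000)")

conn.execute("BEGIN")
# Change made before the savepoint: survives the partial rollback.
conn.execute("UPDATE employees SET salary = salary + 500 WHERE id = 2")
conn.execute("SAVEPOINT sp1")
# Change made after the savepoint: undone by ROLLBACK TO sp1.
conn.execute("UPDATE employees SET salary = salary + 1000 WHERE id = 1")
conn.execute("ROLLBACK TO sp1")
conn.execute("COMMIT")

salaries = conn.execute("SELECT id, salary FROM employees ORDER BY id").fetchall()
print(salaries)
```

This is why the position of the SAVEPOINT statement matters: a savepoint placed after a statement cannot be used to undo that statement.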

Autocommit
ur

By default, some RDBMS (e.g., MySQL) commit each statement unless disabled:

SET autocommit = 0;
r.S

Table 4: Transactions
Concept Use
Transaction Logical unit of database operations
COMMIT Apply all changes permanently
ROLLBACK Undo changes in case of error
D

Savepoints Allow partial rollback within a transaction


Isolation Levels Manage concurrent transactions safely

44 Additional SQL Transaction-Based Questions and Answers

Question 1:
Write an SQL transaction to handle candidate withdrawal from a job application. Ensure that
the corresponding entry in the Application table is deleted, and if the deletion fails, rollback
the changes.
Answer:

START TRANSACTION;

-- Mark the state before the deletion so that it can be undone
SAVEPOINT before_delete;

DELETE FROM Application
WHERE candidate_id = 123 AND job_id = 456;

-- If no error:
COMMIT;

-- If error occurs:
-- ROLLBACK TO before_delete;
-- COMMIT;

Purpose: Ensures no orphaned or half-deleted application remains if deletion fails.


Question 2:
Write an SQL transaction to update a candidate’s AI evaluation score. Ensure the operation
is atomic and track the previous score using a savepoint.
D

Answer:

START TRANSACTION;

SAVEPOINT before_update;

UPDATE ai_scores
SET score = 92
WHERE candidate_id = 123 AND skill = 'Python';

-- If update successful
COMMIT;

-- If not:
-- ROLLBACK TO before_update;
-- COMMIT;

Purpose: Allows reverting to the original score in case of an update failure.

Question 3:
Write an SQL transaction to assign a new job posting by inserting into the Jobs table and then
automatically logging the insertion into an audit_log table. Rollback if either step fails.

Answer:

START TRANSACTION;

INSERT INTO Jobs (job_id, title, department)
VALUES (301, 'Data Analyst', 'AI');

SAVEPOINT after_job_insert;

INSERT INTO audit_log (operation, entity, timestamp)
VALUES ('INSERT', 'Jobs', NOW());

COMMIT;

-- If either insert fails, undo the whole transaction
-- (ROLLBACK TO after_job_insert would keep the unlogged job row):
-- ROLLBACK;

Purpose: Maintains transactional integrity while performing multi-step operations.


Question 4:
Write an SQL transaction to simulate a candidate moving from the ‘shortlisted’ to the ‘selected’
stage. Update the status and create a backup entry in a status_history table.
Answer:

START TRANSACTION;

UPDATE Application
SET status = 'selected'
WHERE candidate_id = 123 AND job_id = 456;

SAVEPOINT after_status_update;

INSERT INTO status_history (candidate_id, job_id, previous_status, new_status, changed_on)
VALUES (123, 456, 'shortlisted', 'selected', NOW());

COMMIT;

-- If status history logging fails, undo the status update as well
-- (ROLLBACK TO after_status_update would keep the untracked status change):
-- ROLLBACK;


Purpose: Tracks status transitions safely in a multi-step update process.


Question 5:
Write an SQL transaction to cancel all applications for a job if the job is closed. Ensure that
all deletions and status updates are part of one atomic operation.
Answer:

START TRANSACTION;

UPDATE Jobs
SET status = 'closed'
WHERE job_id = 456;

DELETE FROM Application
WHERE job_id = 456;

SAVEPOINT after_cleanup;

INSERT INTO audit_log (operation, entity, details, timestamp)
VALUES ('DELETE', 'Application', 'All candidates for job 456', NOW());

COMMIT;

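-- If any step (including logging) fails, undo everything:
-- ROLLBACK;
-- (ROLLBACK TO after_cleanup followed by COMMIT would instead keep the
-- cleanup without its audit entry.)

Purpose: Ensures the job closing process includes cleanup and logging or is completely
undone on failure.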
Question 6:
Write an SQL transaction for an AI-based recruitment system specified in Question No. 3
that ensures the data consistency when a candidate applies for a job. The transaction should
consist of the following steps:
a) Start a transaction to ensure atomicity.
b) Insert a new record into the Application table with the candidate’s ID and job ID.
c) Use a SAVEPOINT after inserting the above record to track progress before final
confirmation.
d) Commit the transaction if the record is successfully inserted.
e) Roll back to the SAVEPOINT if the insertion of the record fails, ensuring no incomplete
records remain.
Answer:

-- Step a) Start a Transaction to ensure atomicity


START TRANSACTION;

-- Step b) Insert a new record into the Application table


INSERT INTO Application (candidate_id, job_id)
VALUES (123, 456); -- Replace with actual candidate and job IDs

-- Step c) Create a SAVEPOINT after successful insertion


SAVEPOINT after_application_insert;

-- Step d) If no error occurs, commit the transaction


COMMIT;

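-- Step e) If a later step fails, roll back to the savepoint. The inserted
-- application row is kept, because the savepoint was taken after the INSERT.
-- If the INSERT itself fails, the savepoint does not exist yet: use ROLLBACK.

-- ROLLBACK TO after_application_insert;
-- COMMIT;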

Explanation:

Step                    Description
START TRANSACTION       Begins an atomic transaction block
INSERT INTO             Attempts to insert the candidate’s application into the database
SAVEPOINT               Creates a rollback checkpoint after successful insertion
COMMIT                  Finalizes all changes if successful
ROLLBACK TO SAVEPOINT   Undoes only the changes made after the savepoint

This transaction ensures that the system maintains data integrity and avoids storing partial
or failed application data.
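
What a savepoint placed after the INSERT actually protects can be checked directly: ROLLBACK TO undoes only the work done after the savepoint, so the application row survives. The sketch below assumes Python's sqlite3 module and an in-memory database. The notifications table is hypothetical and stands in for any follow-up step that might fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Application (candidate_id INTEGER, job_id INTEGER)")
# Hypothetical follow-up table; message is NOT NULL so a bad row can fail.
conn.execute(
    "CREATE TABLE notifications (candidate_id INTEGER, message TEXT NOT NULL)"
)

conn.execute("BEGIN")
conn.execute("INSERT INTO Application (candidate_id, job_id) VALUES (123, 456)")
conn.execute("SAVEPOINT after_application_insert")
try:
    conn.execute("INSERT INTO notifications VALUES (123, 'Application received')")
    conn.execute("INSERT INTO notifications VALUES (123, NULL)")  # fails
except sqlite3.IntegrityError:
    # Undo everything done since the savepoint, including the first notification.
    conn.execute("ROLLBACK TO after_application_insert")
conn.execute("COMMIT")

applications = conn.execute("SELECT COUNT(*) FROM Application").fetchone()[0]
notifications = conn.execute("SELECT COUNT(*) FROM notifications").fetchone()[0]
```

The application is committed and both notification rows are gone. To discard the application as well, the handler would issue a plain ROLLBACK instead.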

45 Transaction Management in AI-Driven Recruitment

Context
In an AI-powered recruitment system, a candidate’s application involves multiple steps that
must be atomically committed to ensure data integrity.
Transactional Steps

START TRANSACTION;
INSERT INTO candidates (...) VALUES (...);
INSERT INTO resumes (...) VALUES (...);
INSERT INTO assessments (...) VALUES (...);
UPDATE jobs SET applicant_count = applicant_count + 1 WHERE ...;

COMMIT;

ACID Application

Table 1: ACID Properties in AI Recruitment

Property      Application
Atomicity     Either all steps (candidate, resume, assessment score, applicant count) complete, or none do
Consistency   Keeps job and candidate data aligned
Isolation     Prevents overlapping updates to applicant_count
Durability    Data is permanent once committed

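Concurrency Example

If two candidates apply to the same job at the same time, both transactions update the same
applicant_count row. The isolation level, set before each transaction starts, controls how
the two transactions see each other's changes:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;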
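
The anomaly at stake when two applications overlap is the lost update. The following is a simplified, single-connection simulation of that interleaving, assuming Python's sqlite3 module and an in-memory database. The variables seen_by_a and seen_by_b stand in for two separate application sessions, and SQLite does not support SET TRANSACTION ISOLATION LEVEL, so the sketch shows only the read-then-write hazard and the safer in-database increment.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, applicant_count INTEGER)")
conn.execute("INSERT INTO jobs VALUES (456, 0)")


def read_count():
    """Return the current applicant_count for job 456."""
    return conn.execute(
        "SELECT applicant_count FROM jobs WHERE job_id = 456"
    ).fetchone()[0]


# Unsafe pattern: both sessions read first, then each writes back its own value.
seen_by_a = read_count()
seen_by_b = read_count()
conn.execute("UPDATE jobs SET applicant_count = ? WHERE job_id = 456", (seen_by_a + 1,))
conn.execute("UPDATE jobs SET applicant_count = ? WHERE job_id = 456", (seen_by_b + 1,))
lost_update_result = read_count()  # two applications, but one increment is lost

# Safer pattern: let the database do the arithmetic inside the UPDATE itself.
conn.execute("UPDATE jobs SET applicant_count = 0 WHERE job_id = 456")
for _ in range(2):
    conn.execute(
        "UPDATE jobs SET applicant_count = applicant_count + 1 WHERE job_id = 456"
    )
atomic_result = read_count()
print(lost_update_result, atomic_result)
```

This is why the transactional steps use applicant_count = applicant_count + 1 rather than writing back a value computed by the application.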

Savepoints

SAVEPOINT after_resume;
-- AI score fails
ROLLBACK TO after_resume;

46 Serialization in RDBMS

What is Serialization?
Serialization (more precisely, serializability) ensures that the concurrent execution of
transactions produces the same result as if the transactions were executed serially (one
after another) in some order.
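
One common textbook way to test this property is conflict serializability: draw an edge from one transaction to another whenever an operation of the first conflicts with a later operation of the second on the same item, then check the graph for cycles. The sketch below is a minimal illustration. The schedule format (transaction, operation, item) and the function name are assumptions, and an acyclic graph is a sufficient condition for serializability, not the only one.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) in execution order, op is 'R' or 'W'."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))

    # Topological sort: repeatedly remove transactions with no incoming edge.
    remaining = {t for t, _, _ in schedule}
    while remaining:
        free = {
            t for t in remaining
            if not any(a in remaining and b == t for a, b in edges)
        }
        if not free:
            return False  # every remaining transaction is part of a cycle
        remaining -= free
    return True


# T1 finishes its read and write before T2 touches the counter.
serial_like = [
    ("T1", "R", "applicant_count"), ("T1", "W", "applicant_count"),
    ("T2", "R", "applicant_count"), ("T2", "W", "applicant_count"),
]
# Both read before either writes: the classic lost-update interleaving.
interleaved = [
    ("T1", "R", "applicant_count"), ("T2", "R", "applicant_count"),
    ("T1", "W", "applicant_count"), ("T2", "W", "applicant_count"),
]
print(is_conflict_serializable(serial_like), is_conflict_serializable(interleaved))
```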
Why is it Important?
• Prevents concurrency issues such as dirty reads, non-repeatable reads, and phantom reads
• Ensures ACID compliance, especially Isolation
• Maintains consistency even in multi-user environments
Serializable Isolation Level

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;


START TRANSACTION;
-- queries
COMMIT;

• Prevents all concurrency anomalies
• In lock-based implementations, locks the rows or ranges that match WHERE conditions

Table 1: Serialization Techniques

Technique                              Description
Two-Phase Locking (2PL)                All locks acquired before releasing any
Timestamp Ordering                     Transactions ordered by timestamps
Serializable Snapshot Isolation (SSI)  Prevents anomalies without explicit locks
Predicate Locking                      Locks ranges or sets based on conditions
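
The two-phase rule from the first row can be stated in a few lines of code: a transaction may acquire locks only until it releases its first one. The class below is an illustrative sketch with hypothetical names. It tracks the rule for a single transaction only and does not model lock conflicts between transactions or deadlock handling.

```python
class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True after the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item!r} after releasing a lock"
            )
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True  # the growing phase is over
        self.held.discard(item)


t1 = TwoPhaseTransaction("T1")
t1.lock("jobs:456")        # growing phase
t1.lock("candidates:123")  # still growing
t1.unlock("jobs:456")      # shrinking phase begins here
try:
    t1.lock("resumes:123")  # not allowed under 2PL
    violation = None
except RuntimeError as exc:
    violation = str(exc)
```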

Table 2: Trade-Offs

Pros                     Cons
Strong consistency       Lower concurrency
Prevents all anomalies   Can cause deadlocks


Concept           Serialization
Goal              Make concurrent transactions behave like serial execution
SQL Support       SERIALIZABLE isolation level
Techniques Used   2PL, SSI, Timestamps, Predicate Locks

47 Serialization in AI-Driven Recruitment

What is Serialization?
In a multi-user AI-driven recruitment platform, multiple candidates might apply to the same
job simultaneously. Serialization ensures that these concurrent transactions behave as if
they were executed one after another, maintaining data consistency.
Use Case
Assume two candidates (A and B) are applying to the same job posting at the same time. Each
of their application workflows includes:
• Inserting candidate record
• Uploading resume
• Submitting AI-evaluated assessment
• Updating job applicant count
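To avoid lost updates or inconsistent scores, the database uses serialization techniques
to ensure that the outcome is the same as if:
1. Transaction T1 (Candidate A) completes entirely.
2. Only then does Transaction T2 (Candidate B) start.

The two transactions may still overlap in time; the database guarantees only that their
combined effect matches this serial order. This maintains a conflict-free, consistent state
in the job table.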


Diagram: Serialized Transaction Flow

Explanation:
• T1 and T2 represent two application processes.
• A serialization control layer ensures only one transaction modifies the job data at a
time.
• Transaction queueing or locking mechanisms are used to enforce this order.
• Final database state reflects consistent and isolated updates from each candidate.
SQL Implementation

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

START TRANSACTION;
-- Candidate A operations (Insert, Upload, Score, Update)
COMMIT;

-- Candidate B's transaction takes effect as if it began only after A completed

Benefits
• Prevents race conditions in updating shared tables
• Ensures AI evaluations and candidate metadata remain correct
• Gives concurrent job applications a well-defined, consistent order
