IT252 MNotes Complete 18042025
NATIONAL INSTITUTE OF TECHNOLOGY KARNATAKA
SURATHKAL
DEPARTMENT OF INFORMATION TECHNOLOGY
BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY
DATABASE SYSTEMS
IT252 MINOR
SEMESTER IV
1 DATABASE SYSTEMS
2 Course Description
3 Course Structure
3.1 Expected Learning Outcomes
4 Evaluation Criteria
5 Introduction to DBMS
6 Types of DBMS
7 DBMS Architecture
8.1 Introduction
8.2 Key Features of RDBMS
8.3 Components of RDBMS
8.4 Concepts in RDBMS
10 Constraints in RDBMS
13 Transactions in RDBMS
15 SMART Health Management Database
15.1 Sample Data Insertion
17 Relational Algebra
17.1 Practical scenario
17.2 Relational Algebra Queries and SQL Equivalents
18 Cross Product and Join in RDBMS
20 AI Matching Recruitment System with Conventional DBMS Models
21 Server Hierarchy
22 Functional Dependency in RDBMS
22.1 Example
22.2 Practical Use Case
41 Views in AI-Driven Recruitment
46 Serialization in RDBMS
1 DATABASE SYSTEMS
Sub Code: IT252
Evaluation Criteria: ASSIGNMENT + QUIZ + MIDTERM + PROJECT + ENDTERM
L-T-P: 3-0-2
Total Hours: 35 (T) + 20 (P)
Total Marks: 100
Credits: 04
Exam Hours: 03
2 Course Description
Database Management Systems (DBMS) is a multidisciplinary field that intersects computer science, information technology, and data engineering. It focuses on the design, development, and management of systems that store, retrieve, and process large volumes of structured data efficiently. DBMS serves as the backbone for applications across various domains, including science, healthcare, finance, and e-commerce, enabling data-driven decision-making and operational excellence.
DBMS encompasses a wide range of concepts, from the fundamental principles of relational models to advanced topics in distributed systems and big data management. Key areas include data modeling, normalization, indexing, query optimization, transaction management, and database security. The field integrates tools and techniques from mathematics, programming, and system design to ensure reliable, scalable, and high-performance data management.
The applications of DBMS are vast and diverse. They range from managing patient records in healthcare systems to powering recommendation engines in e-commerce platforms, and from supporting financial transactions to enabling real-time analytics in IoT systems. Modern DBMS platforms, such as MySQL, PostgreSQL, MongoDB, and Oracle, provide the foundation for these applications, leveraging computational tools and frameworks like SQL, NoSQL, and cloud-based database services.
This course introduces students to the foundational concepts and practical skills required to design and implement database systems. Topics covered include data modeling using Entity-Relationship diagrams, relational algebra, Structured Query Language (SQL), schema design, normalization, indexing, and query optimization. Advanced topics, such as transactions, concurrency control, distributed databases, and NoSQL systems, are also explored.
The course emphasizes real-world applications through hands-on projects and case studies. Students will design and implement database solutions for complex scenarios, such as healthcare management, e-commerce, and logistics, using modern database tools. Ethical considerations, including data privacy, security, and sustainability, are integral to understanding the impact of database systems in a connected world.
Through this course, students will develop a deep understanding of database concepts and acquire practical skills to manage data effectively. This knowledge equips them to solve complex data challenges and prepares them for roles in academia, industry, and research, where data plays a pivotal role in innovation and decision-making.
Course Educational Objectives (CEOs) for Database Management Systems (DBMS):
NITK, Release DB COURSE PLAN-2024-25
• Study of data modeling techniques such as Entity-Relationship (ER) diagrams and schema normalization.
• Hands-on practice with database query languages, including SQL for relational databases and NoSQL for non-relational databases.
• Understanding indexing, query optimization, and performance tuning for efficient data retrieval.
To enable students to apply database management techniques to practical applications:
• Applications in healthcare, e-commerce, logistics, and financial systems, showcasing the versatility of DBMS.
• Case studies involving real-world scenarios like customer relationship management, resource scheduling, and data analysis.
• Integration of database systems with programming languages and frameworks for end-to-end solutions.
To promote an understanding of ethical considerations and responsible data management practices:
• Recognizing the importance of data privacy, security, and integrity in database systems.
• Exploring sustainable database management practices, including energy-efficient storage and minimizing redundancy.

• Understanding modern trends like NoSQL databases, graph databases, and real-time analytics.
• Studying research challenges, including scalability, fault tolerance, and the integration of DBMS with big data and machine learning systems.
Course Outcomes:
CO4: Understand the ethical implications of database management, emphasizing data security, privacy, compliance, and sustainability in designing and managing database systems.
3 Course Structure
Week 1: Introduction to DBMS and ER Modeling
Topics:
• Introduction to Databases:
– What is DBMS?
– Types of databases (Relational, NoSQL, etc.).
• Entity-Relationship (ER) Modeling:
– Entities, attributes, and relationships.
– Primary keys, foreign keys, and cardinality.
Activities:
• Discuss SHMS (Smart Health Management System) as a real-world problem:
– Define entities (e.g., Patient, Doctor, Appointment).
– Highlight relationships (e.g., One-to-Many between Doctor and Appointment).
• Task: Create an ER diagram for SHMS.
Week 2: Schema Design and Normalization
Topics:
• Schema Design:
– Translating an ER diagram into relational tables.
• Normalization:
– 1NF, 2NF, and 3NF concepts.
– Avoid redundancy and dependency issues.
Activities:
• Convert the SHMS ER diagram into a relational schema.
• Normalize SHMS tables:
– Example: Split the Patient table into Patient and Patient_Contact.
• Task: Design schemas for all SHMS entities (Patient, Doctor, Appointment, etc.).
Week 3: SQL Basics (DDL, DML)
Topics:
• Data Definition Language (DDL):
– Creating and modifying tables.
• Data Manipulation Language (DML):
– Insert, update, delete, and retrieve data.
Activities:
• Hands-On:
SELECT *
FROM Appointment
WHERE Doctor_ID = 101;
• Task: Write SQL queries for basic CRUD operations on SHMS.
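As a starting point for the CRUD task, the four operations might look like the sketch below. The Appointment column names other than Doctor_ID (Appointment_ID, Patient_ID, Appointment_Date, Status) and the status values are assumptions made for illustration; the actual SHMS schema may differ.

```sql
-- Assumed table definition for illustration only.
CREATE TABLE Appointment (
    Appointment_ID   INT PRIMARY KEY,
    Patient_ID       INT NOT NULL,
    Doctor_ID        INT NOT NULL,
    Appointment_Date DATE NOT NULL,
    Status           VARCHAR(20) NOT NULL
);

-- Create: add a new appointment
INSERT INTO Appointment (Appointment_ID, Patient_ID, Doctor_ID, Appointment_Date, Status)
VALUES (1, 1, 101, '2024-12-21', 'Pending');

-- Read: list the appointments of one doctor
SELECT Appointment_ID, Patient_ID, Appointment_Date, Status
FROM Appointment
WHERE Doctor_ID = 101;

-- Update: confirm the appointment
UPDATE Appointment
SET Status = 'Confirmed'
WHERE Appointment_ID = 1;

-- Delete: remove the appointment
DELETE FROM Appointment
WHERE Appointment_ID = 1;
```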
Week 4: Advanced SQL Queries
Topics:
• Joins (INNER, LEFT, RIGHT).
• Aggregate functions (COUNT, SUM, AVG).
• Grouping and sorting.
Activities:
• Teach joins with SHMS queries:
– Example: Retrieve all confirmed appointments with patient and doctor details:
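One way such a query could be written is sketched here. It assumes hypothetical Patient and Doctor tables that each have a Name column, and an Appointment table with Patient_ID, Doctor_ID, Appointment_Date, and a Status column holding the value 'Confirmed'; none of these names are fixed by the course material.

```sql
-- Join each appointment to its patient and doctor, keeping confirmed ones only.
SELECT a.Appointment_ID,
       a.Appointment_Date,
       p.Name AS Patient_Name,
       d.Name AS Doctor_Name
FROM Appointment a
INNER JOIN Patient p ON p.Patient_ID = a.Patient_ID
INNER JOIN Doctor  d ON d.Doctor_ID  = a.Doctor_ID
WHERE a.Status = 'Confirmed'
ORDER BY a.Appointment_Date;
```

An INNER JOIN drops appointments whose patient or doctor row is missing; a LEFT JOIN from Appointment would keep them with NULLs in the name columns.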
Topics:
• Indexes: - Single-column and composite indexes.
• Query optimization techniques: - Analyzing query execution plans.
Activities:
• Add indexes to SHMS tables:
– Example:
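A minimal sketch of both index types, assuming the Appointment table has Doctor_ID and Appointment_Date columns (the index names are made up for illustration):

```sql
-- Single-column index: helps queries that filter on Doctor_ID alone.
CREATE INDEX idx_appointment_doctor
    ON Appointment (Doctor_ID);

-- Composite index: helps queries that filter on Doctor_ID and then on date.
CREATE INDEX idx_appointment_doctor_date
    ON Appointment (Doctor_ID, Appointment_Date);

-- Inspect the execution plan to see whether an index is used
-- (EXPLAIN is available in both MySQL and PostgreSQL).
EXPLAIN
SELECT *
FROM Appointment
WHERE Doctor_ID = 101
  AND Appointment_Date = '2024-12-21';
```

Column order matters in a composite index: the one above can serve a filter on Doctor_ID alone, but not a filter on Appointment_Date alone.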
Activities:
• Create stored procedures for SHMS:
– Example: Book an appointment:
• Practice calling procedures:
CALL BookAppointment(1, 101, '2024-12-21');
• Task: Write stored procedures for common SHMS operations (e.g., updating patient details).
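The BookAppointment procedure called above could be defined roughly as follows in MySQL. This is a sketch under assumptions: the parameter order (patient, doctor, date) is inferred from the CALL, the Appointment columns and the 'Pending' status are hypothetical, and Appointment_ID is assumed to be generated automatically (AUTO_INCREMENT).

```sql
DELIMITER //

CREATE PROCEDURE BookAppointment(
    IN p_patient_id INT,
    IN p_doctor_id  INT,
    IN p_date       DATE
)
BEGIN
    -- Insert the new booking; the ID column is assumed to be AUTO_INCREMENT.
    INSERT INTO Appointment (Patient_ID, Doctor_ID, Appointment_Date, Status)
    VALUES (p_patient_id, p_doctor_id, p_date, 'Pending');
END //

DELIMITER ;
```

The DELIMITER lines are a mysql client convention: they stop the client from treating the semicolon inside the procedure body as the end of the CREATE PROCEDURE statement.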
Week 7: Triggers
Topics:
• What are Triggers? - Types of triggers (BEFORE, AFTER).
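• Automating actions with triggers.
Activities:
• Create SHMS triggers:
– Example: Log updates to patient contact:
-- MySQL syntax. In the mysql client, switch the delimiter (DELIMITER //) around
-- this definition so the semicolon inside the body does not end it early.
CREATE TRIGGER LogPatientUpdate
AFTER UPDATE ON Patient
FOR EACH ROW
BEGIN
    -- Record the old and new contact values of the updated patient row
    INSERT INTO Patient_Audit (Patient_ID, Old_Contact, New_Contact)
    VALUES (OLD.Patient_ID, OLD.Contact, NEW.Contact);
END;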
Week 8: Transactions
Topics:
• What are Transactions?
– ACID properties.
BEGIN;
COMMIT;
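As a minimal sketch of an SHMS transaction, the two statements below either both take effect or neither does. The Billing table, its columns, and the amount are hypothetical and used only for illustration.

```sql
BEGIN;  -- START TRANSACTION is the equivalent spelling in MySQL

-- Step 1: confirm the appointment
UPDATE Appointment
SET Status = 'Confirmed'
WHERE Appointment_ID = 1;

-- Step 2: record the charge for it (hypothetical Billing table)
INSERT INTO Billing (Appointment_ID, Amount)
VALUES (1, 500.00);

COMMIT;  -- issuing ROLLBACK here instead would undo both steps
```

Atomicity is what guarantees that a failure between the two steps cannot leave a confirmed appointment without its billing row.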
Topics:
• What are Views?
– Advantages and limitations.
• Creating and using views.
Activities:
• Create SHMS views:
– Example: Active appointments:
• Task: Students create views for billing summaries and patient histories.
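A view for active appointments might be sketched as below. Treating "active" as Status = 'Confirmed' is an assumption, as are the column names; adjust the filter to whatever the SHMS schema actually uses.

```sql
-- A view is a stored query; it holds no data of its own.
CREATE VIEW ActiveAppointments AS
SELECT a.Appointment_ID,
       a.Patient_ID,
       a.Doctor_ID,
       a.Appointment_Date
FROM Appointment a
WHERE a.Status = 'Confirmed';

-- The view is then queried like a table.
SELECT *
FROM ActiveAppointments
WHERE Doctor_ID = 101;
```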
Week 10: Testing and Deployment
Topics:
• Testing strategies:
– Unit testing for procedures and triggers.
– Integration testing for transactions.
Textbooks
1. Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 2014.
2. R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, 7th Ed., Pearson, 2017.
3. A. Silberschatz, H.F. Korth, S. Sudarshan, Database System Concepts, 6th Ed., McGraw-Hill, 2010.
4 Evaluation Criteria
Assignments (10%);
Quiz(1) (10%);
Mid Term (20%);
Project (20%);
Final Exam (40%).
Component | Content
Quiz | Covers Weeks 1-5
Midterm Examination | Covers Weeks 1-7
Project Evaluation | Abstract, Project Report and Presentation(2)
Endterm Examination | Covers Weeks 1-15
5 Introduction to DBMS
A Database Management System (DBMS) is software that helps in the creation, organization, storage, retrieval, and management of data in databases. Think of it as an advanced filing system that doesn’t just store data but also ensures that the data is secure, consistent, and easily accessible.
Why is a DBMS Important?
• Efficient Data Handling: Instead of manually managing large amounts of data, a DBMS automates the process, making it faster and less error-prone.
• Centralized Management: All data is stored in a central system, making it easy to update and maintain.
• Multiple User Access: A DBMS supports many users working with the data at the same time without conflicts.
Key Functions of DBMS:
In a banking system, the DBMS helps store customer details and allows retrieval of account information whenever needed.
… duplicated or lost.
3. Concurrency Control
• Definition: Allows multiple users to access or modify the database at the same time without conflicts.
• Why It’s Needed: Imagine two people trying to book the same seat on a flight. The DBMS ensures only one booking is confirmed, preventing double booking.
• Mechanisms Used: Locking, transaction management, and isolation levels.
4. Backup & Recovery
• Backup: DBMS can automatically create backups of the data at regular intervals.
• Recovery: If there’s a system crash, power failure, or hardware issue, the DBMS helps restore the data to its last known good state.
Example:
In hospitals, patient records are critical. Even if the server fails, DBMS recovery features ensure data isn’t lost.
5. Data Independence
• Definition: Data can be modified without affecting the programs or applications that access it.
• Types of Data Independence:
– Logical Data Independence: Changes in the logical structure (like adding new fields) don’t affect application programs.
– Physical Data Independence: Changes in physical storage (like moving data from one server to another) don’t impact how applications access the data.
• Why It’s Useful: This makes system upgrades or changes easy without needing to rewrite application code.
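The seat-booking scenario under Concurrency Control can be sketched with a row lock. The Seat table, its columns, and the sample values are hypothetical; SELECT ... FOR UPDATE is supported by MySQL (InnoDB), PostgreSQL, and Oracle.

```sql
-- Assumed table: Seat(Flight_ID, Seat_No, Booked_By), with Booked_By NULL while free.
BEGIN;

-- Lock the seat row. A second session running the same statement
-- waits here until this transaction commits or rolls back.
SELECT Booked_By
FROM Seat
WHERE Flight_ID = 42 AND Seat_No = '12A'
FOR UPDATE;

-- Book the seat only if it is still free. The extra condition means a
-- late-arriving session updates zero rows instead of overwriting the booking.
UPDATE Seat
SET Booked_By = 7
WHERE Flight_ID = 42 AND Seat_No = '12A' AND Booked_By IS NULL;

COMMIT;
```

The application would check the number of rows updated: one row means the booking succeeded, zero means someone else got the seat first.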
6 Types of DBMS
DBMS can be categorized into several types based on their data models and architecture:
1. Hierarchical DBMS
• Structure: Data is organized in a tree-like structure with parent-child relationships. Each parent node can have multiple child nodes, but each child node has only one parent. This rigid hierarchy ensures a clear, organized flow of data.
• Advantages: Fast data retrieval, simple relationships, and efficient for handling large volumes of data with clear hierarchies.
• Disadvantages: Inflexible structure, difficult to reorganize data, and complex to handle many-to-many relationships.
• Example: IBM Information Management System (IMS), which is widely used in banking and telecommunications industries.
2. Network DBMS
• Structure: Uses a graph structure allowing multiple parent-child relationships, forming a network model. This makes it more flexible than the hierarchical model, as data can be accessed through various paths.
• Advantages: Supports complex relationships, faster traversal for certain queries,
… methods (functions) that operate on the data.
• Advantages: Seamless integration with object-oriented programming, reusable code, and better handling of complex data.
• Disadvantages: Less mature than RDBMS, limited community support, and compatibility issues with existing relational data.
• Example: ObjectDB, db4o, used in applications requiring complex data representation, such as CAD systems.
5. NoSQL DBMS (Non-Relational DBMS)
• Purpose: Designed for handling unstructured and semi-structured data. It is optimized for high performance, scalability, and flexibility, making it ideal for Big Data and real-time web applications.
• Applications: Widely used in social media platforms, IoT applications, e-commerce sites, and distributed systems.
• Categories:
– Key-Value Stores: Data is stored as key-value pairs, enabling fast lookups. Examples: Redis, DynamoDB.
– Document Stores: Stores data in JSON, BSON, or XML documents, allowing flexible schemas. Examples: MongoDB, CouchDB.
– Column Stores (wide-column stores): Data is grouped into column families rather than fixed rows, which suits large, write-heavy, distributed workloads. Examples: Apache Cassandra, HBase.
– Graph Databases: Focuses on relationships between data points, using nodes and edges to represent and query data efficiently. Examples: Neo4j, ArangoDB.
• Advantages: High scalability, flexible data models, optimized for distributed environments.
• Disadvantages: Often weaker consistency guarantees than an RDBMS (many systems default to eventual consistency), and no standardized query language.
7 DBMS Architecture
DBMS architecture defines the structure of a database system, focusing on how the data is stored, processed, and accessed. It determines how clients, servers, and databases interact with each other.
1. Single-Tier Architecture
• Description: In this architecture, the database and the application reside on the same machine. Users interact directly with the database without any intermediary.
• Usage: Suitable for small-scale applications where the data load is minimal, and a single user or a few users access the system.
• Advantages:
– Simple to design and implement.
– Fast data access as everything runs on the same system.
• Disadvantages:
– Limited scalability.
– Poor security because the database is directly accessible.
• Example: Microsoft Access, where both the database and the application run on a single computer.
2. Two-Tier Architecture
• Description: This model consists of two layers: the client and the server. The client (user interface) communicates directly with the database server using APIs or query
3. Three-Tier Architecture
• Description: In this architecture, there are three distinct layers: the client layer, application layer, and database layer. This separation provides better security, performance, and scalability.
• Components:
– Client Layer (User Interface): The front-end application that interacts with the user. It sends requests to the application server.
– Application Layer (Business Logic): Processes the client requests, performs necessary computations, and interacts with the database. This layer ensures that business rules are enforced.
– Database Layer (Data Storage & Management): Manages data storage, retrieval, and transactions. It responds to requests from the application layer.
• Advantages:
– Enhanced security as the client cannot directly access the database.
– High scalability, suitable for large enterprise systems.
– Easier maintenance and updates as changes can be made in the middle tier without affecting the client or database.
• Disadvantages:
– Increased complexity in development and maintenance.
– Requires more resources compared to single-tier or two-tier architectures.
• Examples: Enterprise applications like banking systems, e-commerce platforms, and ERP systems.
Advantages of DBMS:
DBMS offers several benefits that enhance data management and system efficiency:
Data Consistency: Ensures data remains accurate and consistent across the database. This is achieved through integrity constraints, normalization, and transaction management, preventing anomalies and redundant data.
Security & Authorization: Provides role-based access control, allowing administrators to define who can access, modify, or delete data. It includes encryption, authentication, and authorization mechanisms to protect sensitive information.
Efficient Query Processing: Optimizes data retrieval using indexing, query optimization techniques, and advanced search algorithms. This reduces the time required to fetch large
Concurrency Control: Supports multiple users accessing the database simultaneously without conflicts. This is managed through locking mechanisms, isolation levels, and transaction controls to maintain data integrity in multi-user environments.
Disadvantages of DBMS
Despite its many advantages, DBMS has some limitations that organizations should consider:
High Cost: Implementing a DBMS can be expensive due to licensing fees, hardware requirements, and ongoing maintenance costs. Large-scale systems also require investments in data storage, servers, and network infrastructure.
Complexity: DBMS requires skilled database administrators (DBAs) to manage, configure, and optimize the system. This includes tasks like performance tuning, security management, and data migration, which can be complex and resource-intensive.
Performance Overhead: DBMS can introduce performance overhead compared to simple file-based systems, especially when handling small datasets. This is due to additional layers of abstraction, security checks, and transaction processing, which can slow down operations in lightweight applications.
DBMS vs. File System:
The following table highlights the key differences between DBMS and File System based on critical features:
Feature | DBMS | File System
Storage | … & columns | … predefined formats
Security | Provides user authentication & authorization mechanisms | No built-in security mechanisms
Redundancy | Eliminates redundancy via normalization techniques | High redundancy as data may be duplicated across files
Concurrency | Supports multiple users simultaneously through transaction management | Limited concurrency support with risk of conflicts
Querying | Uses SQL for efficient querying & data manipulation | Requires manual searching without advanced query tools
MongoDB:
Document-based DBMS storing data in JSON-like BSON format.
Highly flexible and scalable, ideal for big data applications.
Used in content management systems, real-time analytics, and IoT applications.
Redis:
In-memory key-value store used for caching, session management, and real-time analytics.
Extremely fast due to in-memory data processing.
Supports data structures like strings, hashes, lists, sets, and sorted sets.
Cassandra:
Distributed, column-oriented NoSQL database.
Designed for high availability and scalability in large-scale applications.
Commonly used by tech giants for handling massive amounts of data.
Neo4j:
Graph-based DBMS optimized for managing complex relationships.
Uses graph structures with nodes, edges, and properties.
Ideal for social networks, recommendation engines, and fraud detection systems.
8 Relational Database Management System (RDBMS)
8.1 Introduction
• Data stored in tables – Organized into rows (records) and columns (attributes).
• Structured Query Language (SQL) – Used for querying and managing the database.
• Relationships – Tables can be linked using Primary Keys (PK) and Foreign Keys
(FK).
• ACID Compliance – Ensures reliable transactions with Atomicity, Consistency, Isolation, and Durability.
• Triggers – Automated execution of SQL when certain conditions are met.
• Transactions – Ensures data consistency using ACID properties.
8.4 Concepts in RDBMS
Schema:
A Schema is the structure of a database that defines how data is organized. It includes definitions of tables, fields, relationships, constraints, indexes, views, and other elements.
• A database schema is like a blueprint for organizing data.
• It does not store data but defines its organization.
Example:
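A small schema could be sketched as follows. The Students columns follow the relation schema used elsewhere in these notes; the Enrollments table name and the NOT NULL and UNIQUE choices are assumptions for illustration.

```sql
-- The schema is just these definitions; no rows exist until data is inserted.
CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    name       VARCHAR(100) NOT NULL,
    email      VARCHAR(100) UNIQUE
);

CREATE TABLE Enrollments (
    enrollment_id INT PRIMARY KEY,
    student_id    INT NOT NULL,
    course_id     INT NOT NULL,
    -- Relationship: every enrollment must point at an existing student
    FOREIGN KEY (student_id) REFERENCES Students(student_id)
);
```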
Entity:
An Entity is any real-world object that has attributes and can be represented in a database.
• Entities have attributes that describe their properties.
Relation Schema:
A Relation Schema is the structure of a relation (table), which includes:
• The name of the relation (table name)
• The attributes (columns) in the relation
• The data types of attributes
Example:
For the Students relation, the schema is:
Students(student_id: INT, name: VARCHAR(100), email: VARCHAR(100))
Weak Entity:
A Weak Entity is an entity that cannot be uniquely identified by its own attributes alone
and relies on a Strong Entity through a Foreign Key.
• A weak entity has a partial key.
• It must have a relationship with a strong entity.
Example:
A Dependent entity in an Employee-Dependent relationship:
CREATE TABLE Employee (
emp_id INT PRIMARY KEY,
name VARCHAR(100)
);
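The weak entity itself might then be declared as in this sketch. The Dependent column names are hypothetical; dependent_name plays the role of the partial key, and it only becomes unique when combined with the owning employee's emp_id.

```sql
CREATE TABLE Dependent (
    emp_id         INT,
    dependent_name VARCHAR(100),   -- partial key: unique only within one employee
    relationship   VARCHAR(50),
    PRIMARY KEY (emp_id, dependent_name),
    -- A dependent cannot exist without its employee, so deleting
    -- the employee removes the dependents as well.
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id) ON DELETE CASCADE
);
```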
9 Rules of Relational Databases
Codd’s Rules for Relational Database Management Systems (RDBMS):
Codd’s 12 rules, proposed by Dr. Edgar F. Codd, define the requirements for a database
management system to be considered truly relational.
These rules serve as a benchmark to evaluate the functionality of relational database systems.
Rule 1: The Information Rule:
Description: All information in a database is represented explicitly using values in tables.
Example: Customer data like name, address, and phone number is stored in rows and columns within tables, not in proprietary formats.
Use Case: Storing Customer Data in a Relational Format
Scenario: An e-commerce platform needs to store customer details such as name, email, and address.
Implementation in MySQL:
Address TEXT
);
Outcome: All data is stored explicitly in tabular form, ensuring adherence to this rule.
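For reference, a complete table for this scenario might look like the sketch below. Only the Address TEXT column, the Customer table name, and the Customer_ID and Email columns appear in these notes; the remaining column types and the sample row are assumptions.

```sql
CREATE TABLE Customer (
    Customer_ID INT PRIMARY KEY,
    Name        VARCHAR(100),
    Email       VARCHAR(100),
    Address     TEXT
);

-- Every fact about the customer is an explicit value in a row and column.
INSERT INTO Customer (Customer_ID, Name, Email, Address)
VALUES (1, 'Asha Rao', 'asha@example.com', 'Surathkal, Karnataka');
```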
Rule 2: Guaranteed Access Rule:
Implementation in PostgreSQL:
SELECT Email
FROM Customer
WHERE Customer_ID = 1;
Outcome: Data is uniquely accessible using the table name (Customer), primary key (Cus-
tomer_ID), and column name (Email).
Rule 3: Systematic Treatment of NULL Values:
Description: NULL values (representing missing or inapplicable information) must be systematically handled.
Example: A NULL value in the Phone_Number column means the customer’s phone number is unknown, but it must not cause unexpected errors in queries.
Use Case: Handling Missing Data
Scenario: A customer doesn’t provide a phone number during registration.
Implementation in Oracle:
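A minimal sketch of NULL handling for this scenario, assuming the Customer table has a Phone_Number column and a Name column (the placeholder text is made up):

```sql
-- Find customers with no phone number on record.
-- NULL must be tested with IS NULL; "Phone_Number = NULL" never matches any row.
SELECT Customer_ID, Name
FROM Customer
WHERE Phone_Number IS NULL;

-- Substitute a readable placeholder when reporting.
SELECT Customer_ID,
       COALESCE(Phone_Number, 'Not provided') AS Phone
FROM Customer;
```

COALESCE is standard SQL and works in Oracle, MySQL, and PostgreSQL; Oracle also offers NVL for the two-argument case.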
Rule 4: Dynamic Online Catalog Based on the Relational Model:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'my_database';
Outcome: Metadata is accessible through SQL queries, ensuring compliance with this rule.
Rule 5: Comprehensive Data Sub-Language Rule:
Description: The system must support a single, comprehensive language (like SQL) for all
operations, including querying, updating, and defining data.
Example: SQL is used for creating tables (CREATE), querying data (SELECT), and modifying
data (UPDATE).
Use Case: Managing Data with a Single Language
VALUES (101, 1, 250.75);
Outcome: All database operations (creation, manipulation, querying) are performed using
SQL.
Rule 6: View Updatability Rule:
Description: Any view (virtual table) derived from base tables should be updatable if it is theoretically possible.
Example: A view showing active customers (SELECT * FROM Customer WHERE Status = 'Active') must allow updates to the underlying Customer table.
Use Case: Updating a View
Scenario: Update customer email via a view that filters active customers.
Implementation in PostgreSQL:
UPDATE ActiveCustomers
Outcome: The update is reflected in the base table, maintaining compliance with this rule.
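An end-to-end sketch of this use case in PostgreSQL, assuming the Customer table has Name, Email, and Status columns (the new email address is made up). PostgreSQL treats a simple view over a single table, like this one, as automatically updatable.

```sql
CREATE VIEW ActiveCustomers AS
SELECT Customer_ID, Name, Email, Status
FROM Customer
WHERE Status = 'Active';

-- The update is issued against the view but changes the Customer base table.
UPDATE ActiveCustomers
SET Email = 'new.address@example.com'
WHERE Customer_ID = 1;
```

If the customer with Customer_ID = 1 is not active, the row is not visible through the view and the update affects zero rows.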
Rule 7: High-Level Insert, Update, and Delete:
Description: The system must support high-level operations on sets of rows, not just one row at a time.
Example: Update multiple rows in one query:
Use Case: Bulk Updating Employee Salaries
Scenario: Increase the salaries of all employees in the “IT” department by 10%.
Implementation in Oracle:
UPDATE Employee
SET Salary = Salary * 1.1
WHERE Department = 'IT';
Rule 8: Physical Data Independence:
Description: Changes to the physical storage of data must not affect how data is accessed at
the logical level.
Example: Moving the Customer table to a different disk or partition does not affect SQL
queries accessing it.
Use Case: Moving Data to a Different Storage System
Scenario: Migrate the Orders table to a different disk without affecting queries.
Implementation in MySQL:
The table is moved physically, but queries like SELECT * FROM Orders; continue to work
seamlessly.
Outcome: Changes in physical storage do not impact logical queries.
Rule 9: Logical Data Independence:
Description: Changes to the logical structure (schema) of a database must not affect applica-
tions accessing the data.
Example: Adding a new column Middle_Name to the Customer table should not break existing queries.
Use Case: Adding a New Column Without Breaking Existing Queries
Scenario: Add a “LoyaltyPoints” column to the Customer table without affecting existing queries.
Implementation in PostgreSQL:
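The change for this scenario could be sketched as follows. The INT type and the default of 0 are assumptions.

```sql
-- Logical change: add the new column with a default for existing rows.
ALTER TABLE Customer
ADD COLUMN LoyaltyPoints INT DEFAULT 0;

-- A query written before the change names its columns explicitly,
-- so it returns exactly the same result as before.
SELECT Email
FROM Customer
WHERE Customer_ID = 1;
```

Queries that use SELECT * or INSERT statements without a column list are the ones at risk from such changes, which is one reason to name columns explicitly.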
Rule 10: Integrity Independence:
Example: A FOREIGN KEY ensures that a value in the Order table’s Customer_ID column exists in the Customer table.
Use Case: Enforcing a Foreign Key Constraint
Scenario: Ensure that every order references a valid customer.
Implementation in Oracle:
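A sketch of the constraint in Oracle syntax is shown below. The table is named Orders here because ORDER is a reserved word; the column names, types, constraint name, and sample values are assumptions, and a Customer table with a Customer_ID primary key is assumed to exist.

```sql
CREATE TABLE Orders (
    Order_ID    NUMBER PRIMARY KEY,
    Customer_ID NUMBER NOT NULL,
    Amount      NUMBER(10,2),
    CONSTRAINT fk_orders_customer
        FOREIGN KEY (Customer_ID) REFERENCES Customer (Customer_ID)
);

-- Rejected with ORA-02291 (parent key not found) if no customer 999 exists.
INSERT INTO Orders (Order_ID, Customer_ID, Amount)
VALUES (101, 999, 250.75);
```

Because the rule is declared in the catalog rather than coded in each application, every program that writes to Orders is subject to it.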
Rule 12: Nonsubversion Rule:
… and security constraints of the database.
Example: Even if accessing a database through an API, constraints like primary keys must
still be enforced.
Use Case: Enforcing Constraints Even in Direct API Access
Scenario: A developer inserts data via an API that directly interacts with the database.
Implementation in MySQL:
Primary key and foreign key constraints ensure that invalid data cannot bypass rules, even when using APIs.
Outcome: The system enforces integrity at all access levels.
Rule | MySQL | PostgreSQL | Oracle
Information Rule | Fully supported | Fully supported | Fully supported
Guaranteed Access Rule | Fully supported | Fully supported | Fully supported
NULL Handling | Basic | Advanced | Advanced
Online Catalog | Comprehensive | Extensive | Extensive
Comprehensive Language | SQL | SQL + PL/pgSQL | SQL + PL/SQL
View Updatability | Limited | Flexible | Flexible
High-Level Operations | Standard | Standard | Advanced (MERGE)
Physical Independence | Fully supported | Fully supported | Fully supported
Data Integrity: Emphasize strong data integrity and consistency through systematic constraints.
Flexibility: Ensure logical and physical independence, allowing changes without disrupting users or applications.
Usability: Advocate for comprehensive language support and accessible data structures, making relational databases user-friendly.
Practical Relevance:
While no commercial database strictly adheres to all 12 rules, relational databases like MySQL, PostgreSQL, and Oracle Database implement most of them, ensuring robust and reliable data management.
9.1 Relationships in RDBMS
Cardinality:
Cardinality defines the number of instances in one table that can be associated with instances in another table.
Types of Cardinality:
1. One-to-One (1:1) – Each record in Table A is linked to one record in Table B.
2. One-to-Many (1:M) – Each record in Table A can be linked to multiple records in Table B.
3. Many-to-Many (M:N) – Each record in Table A can be linked to multiple records in Table B and vice versa.
Examples:
1. One-to-One:
2. One-to-Many:
3. Many-to-Many:
Types of Relationships:
• One-to-One (1:1) – Each record in Table A has exactly one related record in Table B.
• One-to-Many (1:M) – A record in Table A can have multiple related records in Table B.
• Many-to-Many (M:N) – Multiple records in Table A relate to multiple records in Table B.
– Requires a junction table.
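The three relationship types can be sketched in SQL as follows. All table and column names here are hypothetical and chosen only to illustrate the pattern.

```sql
-- One-to-One: the UNIQUE foreign key allows at most one passport per person.
CREATE TABLE Person (
    person_id INT PRIMARY KEY,
    name      VARCHAR(100)
);
CREATE TABLE Passport (
    passport_id INT PRIMARY KEY,
    person_id   INT NOT NULL UNIQUE,
    FOREIGN KEY (person_id) REFERENCES Person(person_id)
);

-- One-to-Many: many staff rows may share the same dept_id.
CREATE TABLE Department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(100)
);
CREATE TABLE Staff (
    staff_id INT PRIMARY KEY,
    dept_id  INT,
    FOREIGN KEY (dept_id) REFERENCES Department(dept_id)
);

-- Many-to-Many: the junction table holds one row per (student, course) pair.
CREATE TABLE Student (
    student_id INT PRIMARY KEY,
    name       VARCHAR(100)
);
CREATE TABLE Course (
    course_id INT PRIMARY KEY,
    title     VARCHAR(100)
);
CREATE TABLE Student_Course (
    student_id INT,
    course_id  INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES Student(student_id),
    FOREIGN KEY (course_id)  REFERENCES Course(course_id)
);
```

The only structural difference between the 1:1 and 1:M cases is the UNIQUE constraint on the foreign key column.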
Keys in Relational Database Management Systems (RDBMS) play a vital role in ensuring data integrity, uniqueness, and relationships between tables.
Types of Keys in RDBMS:
• Primary Key (PK) – Uniquely identifies a record in a table.
• Foreign Key (FK) – Establishes relationships between tables.
• Unique Key – Ensures column values are unique (but allows NULL).
• Composite Key – A combination of multiple columns as a primary key.
• Candidate Key – A minimal set of attributes that can uniquely identify a row.
• Super Key – A superset of a Candidate Key.
Primary Key (PK):
A Primary Key is a column (or a set of columns) that uniquely identifies each record in a
table.
• Must be unique.
• Cannot contain NULL values.
• A table can have only one primary key.
Example:
CREATE TABLE Students (
    student_id INT PRIMARY KEY,  -- Unique and NOT NULL
    name VARCHAR(100),
    email VARCHAR(100)
);
Unique Key:
A Unique Key constraint ensures that all values in a column are unique but allows NULL values.
• A table can have multiple unique keys.
• Unlike the Primary Key, it permits NULL values.
Example:
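A minimal sketch with two unique keys on one table (the Users table and its columns are hypothetical):

```sql
CREATE TABLE Users (
    user_id INT PRIMARY KEY,
    email   VARCHAR(100) UNIQUE,   -- first unique key
    phone   VARCHAR(20)  UNIQUE    -- second unique key
);

-- Both rows are accepted in MySQL and PostgreSQL: the emails differ,
-- and NULL phone values are not considered equal to each other.
INSERT INTO Users (user_id, email, phone) VALUES (1, 'a@example.com', NULL);
INSERT INTO Users (user_id, email, phone) VALUES (2, 'b@example.com', NULL);
```

How many NULLs a unique column may hold depends on the product: MySQL and PostgreSQL allow several by default, while SQL Server allows only one.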
Composite Key:
A Composite Key is a combination of multiple columns that uniquely identifies a row in a
table.
• Used when a single column is not sufficient to uniquely identify records.
• The combination of columns must be unique.
Example:
);
Candidate Key:
A Candidate Key is a minimal set of attributes that can uniquely identify a row in a table. A
table can have multiple candidate keys, but only one is chosen as the Primary Key.
• Minimal means that removing any attribute from the key would make it non-unique.
Super Key:
A Super Key is a superset of a Candidate Key. It includes additional attributes that are not
necessary for uniqueness but still uniquely identify a row.
• A Candidate Key is the minimal version of a Super Key.
• Super Keys may contain extra attributes that do not contribute to uniqueness.
Example:
CREATE TABLE Customers (
    customer_id INT,
    email VARCHAR(100),
    phone VARCHAR(20),
    address VARCHAR(255),
    PRIMARY KEY (customer_id),
    UNIQUE (email)
);
Here:
• {customer_id} is a Candidate Key.
• {customer_id, email, phone} is a Super Key (it contains extra attributes but still uniquely identifies a row).
Comparison of Keys:
Key Type | Unique? | Allows NULLs? | Number per Table | Purpose
Primary Key | Yes | No | One per table | Uniquely identifies each record
Foreign Key | No | Yes | Multiple | Establishes relationships
);
enrollment_id INT PRIMARY KEY,
student_id INT,
course_id INT,
FOREIGN KEY (student_id) REFERENCES Students(student_id)
);
CREATE TABLE Orders (
    order_id INT,
    product_id INT,
    quantity INT,
    PRIMARY KEY (order_id, product_id)  -- Composite Key
);
es
ur
r.S
D
h
Constraints in RDBMS
Constraints in a Relational Database Management System (RDBMS) are rules applied
to table columns to ensure data integrity and accuracy. They restrict the type of data that
can be stored and maintain the consistency of the database.
Types of Constraints in RDBMS
1. PRIMARY KEY Constraint
• A PRIMARY KEY uniquely identifies each record in a table.
• It must contain unique values and cannot be NULL.
• A table can have only one PRIMARY KEY, which may consist of a single column or
multiple columns (Composite Key).
);
emp_id INT,
FOREIGN KEY (emp_id) REFERENCES Employees(emp_id)
);
3. UNIQUE Constraint
• Ensures that all values in a column are unique.
• Unlike PRIMARY KEY, a table can have multiple UNIQUE constraints.
• NULL values are allowed unless specified otherwise.
NITK, Release DB COURSE PLAN-2024-25
of Birth).
product_name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL
);
5. CHECK Constraint
• Defines a condition that must be met before inserting or updating data.
• Helps enforce business rules (e.g., age must be at least 18).
CREATE TABLE Students ( -- hypothetical table name
    student_id INT PRIMARY KEY,
    age INT CHECK (age >= 18)
);
6. DEFAULT Constraint
• Assigns a default value if no value is provided during insertion.
Example (MySQL):
Example (PostgreSQL):
Constraint               Description
PRIMARY KEY              Uniquely identifies each row (must be unique and NOT NULL).
FOREIGN KEY              Ensures referential integrity between two tables.
UNIQUE                   Ensures all values in a column are unique.
NOT NULL                 Ensures a column cannot have NULL values.
CHECK                    Restricts values based on a condition.
DEFAULT                  Provides a default value if no value is specified.
AUTO_INCREMENT / SERIAL  Automatically generates unique numbers for a column.
Why Use Constraints?
✓ Ensures data integrity and accuracy
✓ Prevents invalid data entry
✓ Enforces business rules at the database level
✓ Reduces the need for manual validation in applications
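The constraints summarized above can be seen rejecting invalid rows in a small experiment. This is a minimal sketch that uses Python's built-in sqlite3 module rather than MySQL, so details differ (for example, SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`). The Departments table, the `status` column and all data values are assumptions made for the illustration, as is the helper name `rejected`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Departments (
    dept_id INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL UNIQUE
);
CREATE TABLE Employees (
    emp_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    age INTEGER CHECK (age >= 18),
    status TEXT DEFAULT 'Active',
    dept_id INTEGER,
    FOREIGN KEY (dept_id) REFERENCES Departments(dept_id)
);
INSERT INTO Departments (dept_id, dept_name) VALUES (1, 'IT');
INSERT INTO Employees (emp_id, name, age, dept_id) VALUES (1, 'Alice', 30, 1);
""")

def rejected(sql):
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "duplicate primary key": rejected(
        "INSERT INTO Employees (emp_id, name, age) VALUES (1, 'Bob', 25)"),
    "NULL in NOT NULL column": rejected(
        "INSERT INTO Employees (emp_id, name, age) VALUES (2, NULL, 25)"),
    "CHECK violated": rejected(
        "INSERT INTO Employees (emp_id, name, age) VALUES (3, 'Carol', 16)"),
    "unknown foreign key": rejected(
        "INSERT INTO Employees (emp_id, name, age, dept_id) VALUES (4, 'Dan', 40, 99)"),
    "duplicate UNIQUE value": rejected(
        "INSERT INTO Departments (dept_id, dept_name) VALUES (2, 'IT')"),
}
# The DEFAULT constraint filled in the status that the first insert omitted.
default_status = conn.execute(
    "SELECT status FROM Employees WHERE emp_id = 1").fetchone()[0]

for rule, was_rejected in results.items():
    print(rule, "->", "rejected" if was_rejected else "accepted")
print("default status:", default_status)
```

Every invalid statement is stopped by the database itself, which is the point of enforcing rules at the database level rather than only in application code.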
11 Structured Query Language (SQL)
Structured Query Language (SQL) is a standard language used to interact with Relational
Database Management Systems (RDBMS). It allows users to create, manipulate, and
retrieve data efficiently.
Key Features of SQL:
• Declarative Language – Users specify what they want, and the system determines how
to execute it.
• Standardized Language – Used across multiple RDBMS platforms like MySQL,
PostgreSQL, SQL Server, and Oracle.
• Powerful Query Capabilities – Supports filtering, aggregation, and joins for complex
data retrieval.
• Data Integrity & Security – Includes constraints, transactions, and access control
mechanisms.
• Scalability & Performance – Optimized for handling large datasets efficiently.
Types of SQL Commands:
SQL is categorized into five main types:
1. Data Definition Language (DDL) – Defines and modifies database structure.
Example:
CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100) UNIQUE
);
2. Data Manipulation Language (DML) – Handles data operations.
• INSERT – Adds new records.
• UPDATE – Modifies existing records.
• DELETE – Removes records.
Example:
INSERT INTO Students (student_id, name, email) VALUES (1, 'Alice', 'alice@example.com');
BEGIN TRANSACTION;
UPDATE Students SET email = '[email protected]' WHERE student_id = 1;
ROLLBACK;
Importance of SQL in RDBMS:
• Data Management – Efficiently handles structured data.
• Data Integrity & Security – Prevents unauthorized access and maintains consistency.
• Scalability – Supports large-scale applications and complex queries.
• Standardization – Universally accepted across different database systems.
11.2 SQL Basics
Introduction to SQL: creating tables, basic queries, and DML operations (INSERT, UPDATE,
DELETE).
1. Creating Tables in SQL:
Tables are the fundamental building blocks of a relational database, where data is stored in
rows and columns.
Syntax for Creating a Table
Explanation
Example
This creates an Employees table with:
• A unique EmployeeID as the primary key.
• First and last names (both required).
• Date of birth and salary with specific data types.
2. Basic SQL Queries:
SQL queries are used to retrieve data from tables.
SELECT Statement
Used to retrieve specific columns or all columns from a table.
Syntax
SELECT column1, column2, ...
FROM table_name
[WHERE condition]
[ORDER BY column [ASC|DESC]];
Example
-- Retrieve all columns
SELECT * FROM Employees;
3. DML Operations:
Data Manipulation Language (DML) commands are used to modify data within tables. These
include INSERT, UPDATE, and DELETE.
A. INSERT Statement
Syntax
Example
B. UPDATE Statement
Used to modify existing records in a table.
Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Example
-- Update salary for a specific employee
UPDATE Employees
SET Salary = 80000.00
WHERE EmployeeID = 1;
-- Update salaries for all employees with salary below a certain amount
UPDATE Employees
SET Salary = Salary + 5000.00
WHERE Salary < 70000.00;
C. DELETE Statement
Used to remove records from a table.
Syntax
Example
• Constraints: Use constraints like PRIMARY KEY, NOT NULL, and FOREIGN KEY to
maintain data consistency.
• WHERE Clause: Always use WHERE with UPDATE and DELETE to avoid unintentional
changes to all rows.
• SELECT *: Avoid using SELECT * in production; explicitly list required columns for better
performance.
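The advice about WHERE clauses and explicit column lists can be tried on a throwaway table. The sketch below uses Python's sqlite3 module with an in-memory database. The Employees columns follow the description earlier in this section, but the exact column names (FirstName, LastName) and the three sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (
    EmployeeID INTEGER PRIMARY KEY,
    FirstName TEXT NOT NULL,
    LastName TEXT NOT NULL,
    Salary REAL
);
INSERT INTO Employees VALUES (1, 'Asha', 'Rao', 65000.00);
INSERT INTO Employees VALUES (2, 'Ravi', 'Kumar', 72000.00);
INSERT INTO Employees VALUES (3, 'Meera', 'Shetty', 58000.00);
""")

# UPDATE with WHERE: only rows below the threshold are changed.
raised = conn.execute(
    "UPDATE Employees SET Salary = Salary + 5000.00 WHERE Salary < ?", (70000.00,)
).rowcount

# DELETE with WHERE: exactly one row is removed.
deleted = conn.execute(
    "DELETE FROM Employees WHERE EmployeeID = ?", (3,)
).rowcount

# List the required columns explicitly instead of using SELECT *.
remaining = conn.execute(
    "SELECT EmployeeID, Salary FROM Employees ORDER BY EmployeeID"
).fetchall()

print(raised, deleted, remaining)  # 2 1 [(1, 70000.0), (2, 72000.0)]
```

Checking `rowcount` after an UPDATE or DELETE is a cheap way to notice a WHERE clause that matched more rows than intended.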
The ALTER TABLE statement in MySQL is used to modify the structure of an existing table.
You can use it to add, modify, or delete columns, as well as to add or remove constraints.
1. Add a New Column:
To add a new column to an existing table:
Syntax
ALTER TABLE table_name
ADD column_name datatype [constraints];
Example
ALTER TABLE Employees
ADD Email VARCHAR(100);
2. Modify a Column:
Example
-- Change the datatype of 'Salary' to FLOAT
ALTER TABLE Employees
MODIFY Salary FLOAT;
3. Rename a Column:
Example
4. Drop a Column:
To delete a column from an existing table:
Syntax
Example
5. Add Constraints:
To add constraints to an existing table:
Syntax
ALTER TABLE table_name
ADD CONSTRAINT constraint_name constraint_type (column_name);
Example
-- Add a UNIQUE constraint to the 'Email' column
ALTER TABLE Employees
ADD CONSTRAINT unique_email UNIQUE (Email);
-- Add a FOREIGN KEY constraint
ALTER TABLE Employees
ADD CONSTRAINT fk_department_id FOREIGN KEY (DepartmentID)
    REFERENCES Departments(DepartmentID);
6. Drop Constraints:
To remove constraints from an existing table:
Syntax
Example
Key Considerations:
• Data Loss: Dropping or modifying columns can lead to data loss.
• Dependent Objects: Ensure that indexes or constraints tied to the columns are considered
before modification.
• MySQL-Specific Syntax: The CHANGE keyword is unique to MySQL for renaming
columns.
ACID Properties in RDBMS
• Atomicity – A transaction is all or nothing.
• Consistency – The database remains in a valid state.
• Isolation – Multiple transactions do not interfere with each other.
• Durability – Once committed, data is permanently recorded.
13 Transactions in RDBMS
A transaction in SQL is a sequence of operations performed as a single logical unit of work.
Transactions help maintain the integrity and consistency of a database, especially in
multi-user environments.
What is a Transaction?
Transaction control ensures data integrity.
A transaction is a group of SQL operations that are executed together, following the ACID
properties:
• Atomicity: All operations succeed or none do.
• Consistency: The database remains in a valid state before and after the transaction.
• Isolation: Transactions are isolated from each other.
• Durability: Once committed, changes are permanent even after a system failure.
Transaction Control Commands:
BEGIN TRANSACTION;
Example:
START TRANSACTION;
UPDATE Accounts
SET Balance = Balance - 500
WHERE AccountID = 101;
UPDATE Accounts
SET Balance = Balance + 500
WHERE AccountID = 202;
COMMIT;
• Explanation: Money is transferred from Account 101 to 202.
• Impact: Both updates are saved permanently.
ROLLBACK:
Role:
• Reverts the database to its state before the transaction started.
• Used to handle errors or invalid data during a transaction.
Example:
START TRANSACTION;
UPDATE Accounts
SET Balance = Balance - 1000
WHERE AccountID = 101;
COMMIT;
END IF;
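The commit-or-rollback decision described above can be sketched from application code. This is a hedged illustration using Python's sqlite3 module, not the MySQL stored-procedure form. The `CHECK (Balance >= 0)` rule, the starting balances and the function name `transfer` are assumptions introduced so that a failing transfer exists to roll back.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, "
    "Balance REAL NOT NULL CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Accounts VALUES (101, 800.0), (202, 100.0)")

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; undo both updates if either one fails."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Accounts SET Balance = Balance + ? WHERE AccountID = ?", (amount, dst))
        conn.execute(
            "UPDATE Accounts SET Balance = Balance - ? WHERE AccountID = ?", (amount, src))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")  # the credit to dst is undone as well
        return False

first = transfer(conn, 101, 202, 500)    # succeeds: balances become 300 and 600
second = transfer(conn, 101, 202, 1000)  # would overdraw account 101, so it is rolled back
balances = dict(conn.execute("SELECT AccountID, Balance FROM Accounts"))
print(first, second, balances)  # True False {101: 300.0, 202: 600.0}
```

The credit is applied before the debit on purpose: when the debit fails, the rollback visibly removes the credit too, which is atomicity in action.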
SAVEPOINT:
Role:
• Creates a checkpoint within a transaction.
• Allows partial rollbacks to specific points without rolling back the entire transaction.
Example:
START TRANSACTION;
INSERT INTO Orders (OrderID, Product, Quantity) VALUES (1, 'Laptop', 2);
SAVEPOINT sp1;
INSERT INTO Orders (OrderID, Product, Quantity) VALUES (2, 'Phone', 3);
SAVEPOINT sp2;
INSERT INTO Orders (OrderID, Product, Quantity) VALUES (3, 'Tablet', 1);
-- Simulate an error
ROLLBACK TO sp2;
COMMIT;
• Explanation:
– sp1 and sp2 are checkpoints.
– The insertion of the Tablet is rolled back, but Laptop and Phone are saved.
• Impact: Only the last operation (after sp2) is undone.
ROLLBACK TO SAVEPOINT:
Role:
• Rolls back part of a transaction to a specific savepoint without affecting earlier
operations.
Example:
START TRANSACTION;
UPDATE Products SET Stock = Stock - 10 WHERE ProductID = 1;
SAVEPOINT sp1;
COMMIT;
• Explanation:
– The change to ProductID = 2 is undone.
– The change to ProductID = 1 remains.
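A partial rollback of this kind can be reproduced end to end. The sketch below runs the savepoint sequence through Python's sqlite3 module. The starting stock levels and the size of the second update (5 units) are assumptions; only the behaviour of ROLLBACK TO is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transaction control
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Stock INTEGER)")
conn.execute("INSERT INTO Products VALUES (1, 50), (2, 40)")

conn.execute("BEGIN")
conn.execute("UPDATE Products SET Stock = Stock - 10 WHERE ProductID = 1")
conn.execute("SAVEPOINT sp1")
conn.execute("UPDATE Products SET Stock = Stock - 5 WHERE ProductID = 2")
conn.execute("ROLLBACK TO sp1")  # undoes only the ProductID = 2 change
conn.execute("COMMIT")           # the ProductID = 1 change becomes permanent

stock = dict(conn.execute("SELECT ProductID, Stock FROM Products"))
print(stock)  # {1: 40, 2: 40}
```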
START TRANSACTION;
SAVEPOINT sp1;
Explanation:
• Step 1 & 2: Successful transactions, saved with savepoints.
• Step 3: Error occurs while updating an invalid account.
• Step 4: Rolled back to sp2, keeping previous updates intact.
• COMMIT: Finalizes the successful parts.
Comparison of Transaction Commands:
Command                 Purpose                                  Can Roll back?       Permanent Changes?
START TRANSACTION       Begins a new transaction                 Yes                  No
COMMIT                  Saves all changes permanently            No                   Yes
ROLLBACK                Reverts all changes in the transaction   Yes                  No
SAVEPOINT               Creates a checkpoint                     Yes (to savepoint)   No
ROLLBACK TO SAVEPOINT   Rolls back to a specific savepoint       Yes                  No
RELEASE SAVEPOINT       Deletes a savepoint                      No                   No
DELETE in Transactions:
Role:
• DELETE is a DML command that removes specific rows from a table.
• It fully supports transactions: you can roll back a DELETE operation if it has not yet
been committed.
START TRANSACTION;
• TRUNCATE is a DDL-like operation (although it affects data).
• Its behavior varies by database system:
Database System   Transaction Support for TRUNCATE
PostgreSQL        Supports rollback for TRUNCATE.
MySQL (InnoDB)    Does NOT support rollback. Auto-committed.
Oracle            Cannot roll back. Auto-commits immediately.
START TRANSACTION;
DROP in Transactions
Role: - DROP is a DDL command that removes entire database objects (tables, views, etc.).
- Like TRUNCATE, its behavior depends on the DBMS:
START TRANSACTION;
COMMIT;
DROP   DDL   Structure + Data   Depends on DBMS (PostgreSQL: yes, MySQL: no)   Auto-commit in MySQL
1. DELETE fully supports transactions:
• Changes can be rolled back before committing.
• Works like other DML commands (INSERT, UPDATE).
2. TRUNCATE behaves like a DDL command:
• Rollback supported in PostgreSQL.
• Auto-committed in MySQL and Oracle (cannot be rolled back).
3. DROP is a DDL command:
• Rollback supported in PostgreSQL.
• Auto-committed in MySQL and Oracle (cannot be rolled back).
4. Best Practice:
• Always test transaction behaviors in your specific DBMS.
• Use DELETE when you need rollback capability.
• Be cautious with TRUNCATE and DROP, especially in MySQL.
DELETE vs TRUNCATE vs DROP in SQL:
This section summarizes the key differences between the DELETE, TRUNCATE, and DROP
commands in SQL. These commands are used for managing data and database objects, but they
serve different purposes.
• DELETE: Removes specific records from a table based on a condition.
• TRUNCATE: Deletes all records from a table quickly without logging individual row
deletions.
• DROP: Removes the entire table, including its structure and data.
DELETE Command:
Purpose:
The DELETE command is used to remove specific rows from a table using the WHERE clause.
Syntax:
Example:
TRUNCATE TABLE table_name;
Example:
Key Points:
• Deletes all records from the table.
• Cannot be rolled back in some databases (DBMS-dependent).
• Resets auto-increment counters.
• Faster than DELETE because it minimizes logging.
• Does not activate triggers in most databases.
When to Use?
• When you need to quickly delete all data while keeping the table structure intact.
DROP Command:
Purpose:
The DROP command is used to remove database objects like tables, views, indexes, or entire
databases.
Syntax:
Example:
• Explanation: Completely removes the Employees table from the database, including
its structure.
Key Points:
• Deletes the entire table (structure + data).
• Cannot be rolled back in most databases.
• Removes all constraints, indexes, and triggers associated with the table.
When to Use?
• When you need to completely remove a table from the database.
Feature                DELETE                          TRUNCATE                             DROP
Rollback Support       Yes (if within a transaction)   Depends on DBMS                      No (cannot be rolled back)
Auto-Increment Reset   No                              Yes                                  Yes
Affects Structure?     No                              No                                   Yes (removes structure)
Triggers               Yes (triggers are fired)        No (triggers are not fired)          No
Performance            Slower for large datasets       Faster than DELETE                   Fastest (removes the object)
Use Case               Delete specific records         Remove all records, keep structure   Completely remove the table
Differences:
1. Data vs Structure:
• DELETE: Removes data but keeps the table structure.
• TRUNCATE: Removes all data but keeps the structure intact.
• DROP: Removes both the data and the structure.
2. Transaction Control:
3. Performance:
• TRUNCATE is faster than DELETE for large datasets.
• DROP is the fastest as it completely removes the table.
Real-World Use Cases:
• Use Case: Periodically clear logs while keeping the table structure.
• Use TRUNCATE when you need to quickly delete all data from a table while keeping its
schema intact.
• Use DROP when you want to completely remove a table or database object.
14 RDBMS vs. NoSQL
Feature          RDBMS                       NoSQL
Data Model       Tables with relations       Document, Key-Value, Graph
Schema           Fixed schema                Flexible schema
Query Language   SQL                         NoSQL (varies by type)
Scalability      Vertical Scaling            Horizontal Scaling
Use Case         OLTP (Transactional Apps)   Big Data, Real-Time Apps
15 SMART Health Management Database
The Smart Health Management System (SHMS) revolutionizes traditional healthcare practices
by leveraging advanced technologies to provide personalized, efficient, and accessible
healthcare services. It bridges the gap between patients and healthcare providers, ensuring
better health outcomes and quality of life in the digital era.
Description: Develop a system to manage patient records, doctor schedules, and medical
inventory using a centralized database. Incorporate analytics to predict patient trends and
automate reminders for appointments and medication.
Trends: Healthcare data analytics, IoT integration for health monitoring.
The Smart Health Management System will have the following key entities and
relationships:
Entities:
• Patient
• Doctor
• Appointment
• Medical_Record
• Medication
• Pharmacy
• Hospital_Staff
• Room
• Billing
ER Diagram Overview:
Here’s the breakdown of entities, attributes, and their relationships:
Patient
Attributes:
• Patient_ID (PK)
• Name
• Age
• Gender
• Contact
• Address
• Email
Relationships:
• Makes appointments with doctors
• Has medical records
Doctor
Attributes:
• Doctor_ID (PK)
• Name
• Specialization
• Contact
• Email
• Room_Assigned
Relationships:
• Attends to patients through appointments
Appointment
Attributes:
• Appointment_ID (PK)
• Patient_ID (FK)
• Doctor_ID (FK)
• Appointment_Date
• Time
• Status
Relationships:
• Links patients and doctors
Medical_Record
Attributes:
• Record_ID (PK)
• Patient_ID (FK)
• Doctor_ID (FK)
• Diagnosis
• Treatment
• Date
Relationships:
• Belongs to patients
• Created by doctors
Medication
Attributes:
• Medication_ID (PK)
• Name
• Type
• Dosage
• Side_Effects
Relationships:
• Prescribed to patients (via Medical_Record)
Pharmacy
Attributes:
• Pharmacy_ID (PK)
• Name
• Location
• Contact
Relationships:
• Provides medications
Hospital_Staff
Attributes:
• Staff_ID (PK)
• Name
• Role
• Contact
Relationships:
• Assigned to hospital operations
Room
Attributes:
• Room_ID (PK)
• Type
• Availability_Status
Relationships:
• Assigned to patients or doctors
Billing
Attributes:
• Bill_ID (PK)
• Patient_ID (FK)
• Amount
• Date
• Status
Relationships:
• Linked to patients
The following provides the SQL schema and sample data for a Smart Health Management
System. It includes table creation, inserting data, and maintaining relationships in MySQL.
Database Creation:
Table Creation:
1. Patient Table
2. Doctor Table
3. Appointment Table
4. Medical_Record Table
CREATE TABLE Medical_Record (
    Record_ID INT AUTO_INCREMENT PRIMARY KEY,
    Patient_ID INT NOT NULL,
    Doctor_ID INT NOT NULL,
    Diagnosis VARCHAR(255),
    Treatment_Details TEXT,
    Tests_Conducted TEXT,
    Prescription_Details TEXT,
    Record_Date DATE DEFAULT (CURRENT_DATE), -- expression default needs parentheses (MySQL 8.0.13+)
    FOREIGN KEY (Patient_ID) REFERENCES Patient(Patient_ID),
    FOREIGN KEY (Doctor_ID) REFERENCES Doctor(Doctor_ID)
);
5. Room Table
6. Hospital_Staff Table
7. Billing Table
8. Medication Table
CREATE TABLE Medication (
    Medication_ID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Description TEXT,
    Dosage VARCHAR(50),
    Manufacturer VARCHAR(100),
    Expiry_Date DATE,
    Price DECIMAL(10, 2) CHECK (Price >= 0)
);
9. Pharmacy Table
This section provides sample data for each table in the Health Management System.
INSERT INTO Patient (Name, Age, Gender, Address, Contact_Number, Email, Emergency_Contact,
    Insurance_ID, Registration_Date)
VALUES
VALUES
('Dr. Alice Brown', 'Cardiology', 'MD', '1122334455', '[email protected]', 'Mon-Fri 9AM-5PM', 15),
VALUES
(1, 1, '2025-01-05', '10:30:00', 'Scheduled', 'Routine Checkup'),
(2, 2, '2025-01-06', '11:00:00', 'Scheduled', 'Migraine Consultation');
VALUES
(1, 1, 'Hypertension', 'Monitor BP daily, reduce salt intake', 'Blood Pressure Test',
 'Losartan 50mg daily', '2025-01-05'),
(2, 2, 'Migraine', 'Avoid triggers, prescribed medication', 'MRI Scan',
 'Sumatriptan 25mg as needed', '2025-01-06');
INSERT INTO Room (Room_Type, Room_Status, Daily_Rate, Floor_Number)
VALUES
('ICU', 'Available', 3000.00, 2),
('Private', 'Occupied', 2000.00, 3);
VALUES
('Mary Johnson', 'Nurse', '3344556677', '[email protected]', 'ICU', 'Night Shift', 50000.00),
VALUES
(1, 1, 6000.00, 200.00, 500.00, 1000.00, 'Paid', 'Card'),
(2, 2, 0.00, 300.00, 800.00, 1500.00, 'Unpaid', 'Cash');
AI-Powered Recruitment System
Problem Description:
The AI-Powered Recruitment System addresses the inefficiencies and biases in traditional
recruitment processes by leveraging artificial intelligence and predictive analytics. It
provides a centralized platform for candidates and recruiters to interact effectively,
streamlining the process of matching candidates with suitable jobs.
Design a database system for an AI-driven recruitment platform named AIDRecruite. The
system should manage candidates, job postings, applications, interviews, recruiters, and
AI-based job matching. The database should support:
1. Storing candidate details including multi-valued skills.
2. Managing job postings with multi-valued required skills.
3. Handling applications and tracking their statuses.
4. Scheduling and recording interviews.
• Candidates can:
– Register and create profiles.
– Upload resumes and update skills.
– Track their applications.
2. Job Management
• Recruiters can: - Post job openings with detailed descriptions. - Define required
5. Interview Management
• Schedules and tracks:
– Interviews for candidates.
• Records:
– Feedback and outcomes for each round.
6. Recruiter Dashboard
• Offers tools to:
– Manage job postings.
– Review candidate matches.
– Track application statuses.
• Provides insights using analytics.
7. Candidate Dashboard
• Allows candidates to:
– View job recommendations.
– Track application statuses.
– Receive notifications.
Challenges Addressed:
1. Inefficient Matching
• Reduces manual effort in screening candidates.
• Identifies the best matches quickly.
2. Bias in Recruitment
• Focuses on skills and experience.
• Reduces subjective decision-making.
3. Time-Consuming Processes
• Automates screening and shortlisting.
• Saves time for recruiters and candidates.
4. Application Overload
• Manages large volumes of applications.
• Ranks candidates effectively.
Objective:
To design and implement an AI-Powered Recruitment System that:
1. Enhances Hiring Efficiency
• Matches candidates to jobs with high accuracy using AI.
2. Improves Candidate Experience
• Offers tailored job recommendations and real-time tracking.
Explanation:
• Candidate Table: Stores details about job candidates (Candidate_ID, Name, Skills).
• Job Table: Stores job listings (Job_ID, Title, Skills required).
• AI_Matching Table: Represents the many-to-many relationship between Candidate and Job.
– Match_ID: Unique identifier for each match.
– Candidate_ID: Foreign key referencing Candidate.
– Job_ID: Foreign key referencing Job.
– Match_Score: AI-generated score indicating suitability.
Relationships:
• One Candidate can match with multiple Jobs (1:N).
• One Job can match with multiple Candidates (1:N).
• AI_Matching serves as a bridge table for many-to-many relationships.
This structured model ensures efficient job-candidate matching using AI-based scoring.
Explanation:
• Candidate Table: Stores details about job candidates, such as Candidate_ID, Name,
and Skills.
• Job Table: Stores job listings, including Job_ID, Title, and the required Skills.
• Application Table: Represents the relationship between candidates and jobs in the
application process.
– Application_ID: Unique identifier for each application.
– Candidate_ID: Foreign key referencing the Candidate table, indicating which
candidate submitted the application.
– Job_ID: Foreign key referencing the Job table, indicating which job the candidate
applied for.
Candidates and Jobs, with additional details like application status and submission date.
Job-Recruiter Entity-Relationship (ER) Diagram
The following ER diagram illustrates the relationship between Job and Recruiter entities
using an associative entity called Job_Assignment to handle the many-to-many relationship.
ER Diagram:
+-------------+          +----------------+          +-------------+
|     Job     |          | Job_Assignment |          |  Recruiter  |
|-------------|          |----------------|          |-------------|
| Job_ID      | 1      N | Assignment_ID  | N      1 | Recruiter_ID|
| Title       |----------| Job_ID         |----------| Name        |
| Skills      | Assigned | Recruiter_ID   | Manages  | Email       |
| Location    |    To    | Assigned_Date  |   Jobs   | Phone       |
+-------------+          +----------------+          +-------------+
Entity Descriptions:
• Job
– Job_ID (Primary Key)
– Title: The title of the job position.
– Skills: Required skills for the job.
| Name            |----------| Candidate_ID(FK)  |----------| Application_ID(FK)|
| Email           | Applies  | Job_ID(FK)        | Applied  | Interview_Date    |
| Phone           |   For    | Status            |    To    | Interviewer       |
| Skills          |          | Application_Date  |          | Feedback          |
| Experience      |          +-------------------+          | Outcome           |
+-----------------+                    | N                  +-------------------+
        | (1)                          |
        | Matches                      |
        | (N)                          | 1
+-----------------+          +-----------------+          +-------------------+
|   AI_Matching   |          |       Job       |          |  Job_Assignment   |
|-----------------|          |-----------------|          |-------------------|
| Match_ID(PK)    | N      1 | Job_ID(PK)      | 1      N | Assignment_ID(PK) |
| Candidate_ID(FK)|----------| Title           |----------| Job_ID(FK)        |
| Job_ID (FK)     | Matches  | Skills          | Assigned | Recruiter_ID(FK)  |
| Match_Score     |          | Location        |          | Assigned_Date     |
+-----------------+          | Salary          |          +-------------------+
                             +-----------------+                    | (N)
                                                                    | Manages
                                                                    | (1)
                                                           +-----------------+
                                                           |    Recruiter    |
                                                           |-----------------|
                                                           | Recruiter_ID(PK)|
                                                           | Name            |
                                                           | Email           |
                                                           | Phone           |
                                                           +-----------------+
Entity Descriptions:
• Candidate
– Candidate_ID (Primary Key)
– Name: Full name of the candidate.
– Interviewer: Name of the interviewer.
– Feedback: Feedback from the interview.
– Outcome: Interview outcome (e.g., Pass, Fail).
• AI_Matching
– Match_ID (Primary Key)
– Candidate_ID (Foreign Key referencing Candidate)
– Job_ID (Foreign Key referencing Job)
– Match_Score: Score representing the match strength between candidate and job.
• Job
– Job_ID (Primary Key)
– Title: Job title.
– Skills: Required skills for the job.
– Location: Job location.
– Salary: Offered salary for the job.
• Job_Assignment
• Candidate ↔ AI_Matching: A candidate can have multiple matches with different jobs
(1:N).
• Job ↔ AI_Matching: A job can have multiple candidate matches (1:N).
• Job ↔ Job_Assignment: A job can be assigned to multiple recruiters (1:N).
• Recruiter ↔ Job_Assignment: A recruiter can manage multiple jobs (1:N).
SQL Schema:
-- Job Table
CREATE TABLE Job (
    Job_ID INT PRIMARY KEY,
    Title VARCHAR(100),
    Skills VARCHAR(255),
    Location VARCHAR(100)
);
-- Recruiter Table
CREATE TABLE Recruiter (
    Recruiter_ID INT PRIMARY KEY,
    Name VARCHAR(100),
    Email VARCHAR(100),
    Phone VARCHAR(15)
);
-- Job_Assignment Table
CREATE TABLE Job_Assignment (
    Assignment_ID INT PRIMARY KEY,
    Job_ID INT,
    Recruiter_ID INT,
    Assigned_Date DATE,
    FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID),
    FOREIGN KEY (Recruiter_ID) REFERENCES Recruiter(Recruiter_ID)
);
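The schema above can be exercised with a small anti-join experiment. This is a sketch under stated assumptions: it uses Python's sqlite3 module with SQLite column types instead of MySQL, and the sample jobs, recruiters and the single assignment are invented. A LEFT JOIN followed by an IS NULL test keeps exactly the rows that have no partner in Job_Assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY, Title TEXT, Skills TEXT, Location TEXT);
CREATE TABLE Recruiter (Recruiter_ID INTEGER PRIMARY KEY, Name TEXT, Email TEXT, Phone TEXT);
CREATE TABLE Job_Assignment (
    Assignment_ID INTEGER PRIMARY KEY,
    Job_ID INTEGER REFERENCES Job(Job_ID),
    Recruiter_ID INTEGER REFERENCES Recruiter(Recruiter_ID),
    Assigned_Date TEXT
);
INSERT INTO Job VALUES (1, 'Data Analyst', 'SQL', 'Mangalore'),
                       (2, 'ML Engineer', 'Python', 'Bengaluru');
INSERT INTO Recruiter VALUES (10, 'R. Nair', NULL, NULL), (11, 'S. Pai', NULL, NULL);
INSERT INTO Job_Assignment VALUES (100, 1, 10, '2025-01-05');
""")

# Jobs that no recruiter has been assigned to.
jobs_without_recruiter = conn.execute("""
    SELECT J.Job_ID, J.Title
    FROM Job J
    LEFT JOIN Job_Assignment JA ON JA.Job_ID = J.Job_ID
    WHERE JA.Assignment_ID IS NULL
""").fetchall()

# Recruiters who have not been assigned any job.
recruiters_without_job = conn.execute("""
    SELECT R.Recruiter_ID, R.Name
    FROM Recruiter R
    LEFT JOIN Job_Assignment JA ON JA.Recruiter_ID = R.Recruiter_ID
    WHERE JA.Assignment_ID IS NULL
""").fetchall()

print(jobs_without_recruiter)   # [(2, 'ML Engineer')]
print(recruiters_without_job)   # [(11, 'S. Pai')]
```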
Sample Queries:
• Find Jobs Without a Recruiter:
SELECT J.*
FROM Job J
LEFT JOIN Job_Assignment JA ON JA.Job_ID = J.Job_ID
WHERE JA.Assignment_ID IS NULL;
SELECT R.*
FROM Recruiter R
Example Workflow:
1. Candidate A uploads their resume with skills:
• Python
• Machine Learning
• Data Analysis
4. Outcome:
• Candidate A is ranked #1 for Job X.
• Candidate A is recommended to the recruiter.
5. Next Steps:
• The recruiter shortlists Candidate A.
• Schedules an interview.
• Records feedback after the interview.
Database Schema:
CREATE DATABASE AIDRecruite;
USE AIDRecruite;
Candidate Table:
CREATE TABLE Candidate (
Candidate_ID INT AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(255) NOT NULL,
Email VARCHAR(255) UNIQUE NOT NULL,
Phone VARCHAR(20) UNIQUE NOT NULL,
Address TEXT,
Resume_Link VARCHAR(255),
Experience_Years INT,
Profile_Creation_Date DATETIME DEFAULT CURRENT_TIMESTAMP
);
);
Job Table:
PRIMARY KEY (Job_ID, Skill),
FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);
Application Table:
CREATE TABLE Application (
    Application_ID INT AUTO_INCREMENT PRIMARY KEY,
    Candidate_ID INT,
    Job_ID INT,
    Application_Status ENUM('Applied', 'Shortlisted', 'Rejected', 'Hired') DEFAULT 'Applied',
    FOREIGN KEY (Candidate_ID) REFERENCES Candidate(Candidate_ID) ON DELETE CASCADE,
    FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);
Interview Table:
);
Recruiter Table:
);
AI Matching Table:
CREATE TABLE AI_Matching (
Match_ID INT AUTO_INCREMENT PRIMARY KEY,
Candidate_ID INT,
Job_ID INT,
Match_Score DECIMAL(5,2) CHECK (Match_Score >= 0 AND Match_Score <= 100),
FOREIGN KEY (Candidate_ID) REFERENCES Candidate(Candidate_ID) ON DELETE CASCADE,
FOREIGN KEY (Job_ID) REFERENCES Job(Job_ID) ON DELETE CASCADE
);
16.1 Data Insertion
INSERT INTO Candidate (Name, Email, Phone, Address, Resume_Link, Experience_Years) VALUES
,→5),
('Software Engineer', 'Develop and maintain software applications.', 3, 'New York', '$70,000-$90,000'),
Relational Algebra
Illustration:
1. Only in A (not in J or C)
Query Statement:
Find all applications that do not have a matching job or candidate.
Relational Algebra:
Relational Algebra:
J − (A ∪ C) = π_{Job_ID, Title, Description, Required_Skills, Experience_Required, Location, Salary_Range, Posting_Date}(Job)
              − (π_{Job_ID}(Job ⋈ Application) ∪ π_{Job_ID}(Job ⋈ Candidate))
C − (A ∪ J) = π_{Candidate_ID, Name, Email, Phone, Address, Resume_Link, Skills, Experience_Years, Profile_Creation_Date}(Candidate)
              − (π_{Candidate_ID}(Candidate ⋈ Application) ∪ π_{Candidate_ID}(Candidate ⋈ Job))
Relational Algebra:
Relational Algebra:
6. In J and C, but not in A:
Query Statement:
Find all jobs that have a candidate associated but no application exists for them.
Relational Algebra:
(J ∩ C) − A = π_{Job_ID, Candidate_ID, Title, Description, Required_Skills, Experience_Required, Location, Salary_Range, Posting_Date}(Job ⋈ Candidate)
              − π_{Job_ID}(Job ⋈ Application)
A ∩ J ∩ C = π_{Application_ID, Candidate_ID, Job_ID, Application_Status, Application_Date}(Application ⋈ Job ⋈ Candidate)
Query Statement:
Find all records that exist in at least one of the Application, Job, or Candidate tables.
Relational Algebra:
Table of Contents
• Practical scenario
• Relational Algebra Queries and SQL Equivalents
1. Only in A (not in J or C):
Relational Algebra: A - (J ∪ C)
Description: Find all applications that do not have a matching job or candidate.
SELECT *
FROM Application A
WHERE A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);
Relational Algebra: (A ∩ J) - C
Description: Find all applications that have a corresponding job but no corresponding
candidate.
SELECT *
FROM Application A
WHERE A.Job_ID IN (SELECT Job_ID FROM Job)
AND A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate);
SELECT *
FROM Application A
WHERE A.Candidate_ID IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);
SELECT *
FROM Job J
WHERE J.Job_ID IN (
    SELECT M.Job_ID FROM AI_Matching M) -- assumption: job-candidate links are recorded in AI_Matching
AND J.Job_ID NOT IN (
    SELECT A.Job_ID
    FROM Application A
    WHERE A.Job_ID IS NOT NULL);
Or equivalently
SELECT *
FROM Job J
WHERE EXISTS (
    SELECT 1 FROM AI_Matching M
    WHERE M.Job_ID = J.Job_ID
)
AND NOT EXISTS (
    SELECT 1 FROM Application A
    WHERE A.Job_ID = J.Job_ID
);
Relational Algebra: A ∩ J ∩ C
Description: Find all applications where a candidate has applied for a job, meaning there is
a connection between all three entities.
SELECT A.*
FROM Application A
JOIN Job J ON A.Job_ID = J.Job_ID
JOIN Candidate C ON A.Candidate_ID = C.Candidate_ID;
Or equivalently:
SELECT A.*
FROM Application A, Job J, Candidate C
WHERE A.Job_ID = J.Job_ID
AND A.Candidate_ID = C.Candidate_ID;
Relational Algebra: A ∪ J ∪ C
Description: Find all records that exist in at least one of the Application, Job, or Candidate
tables.
SELECT Candidate_ID, NULL, NULL, Name, 'Candidate'
FROM Candidate
UNION
SELECT NULL, Job_ID, NULL, Title, 'Job'
FROM Job
UNION
SELECT Candidate_ID, Job_ID, Application_ID, Application_Status, 'Application'
FROM Application;
Find all jobs where a candidate is linked, but no actual application has been submitted.
Breaking it Down
(J ∩ C) → Jobs with associated candidates hp
• This means there is some logical or inferred connection between jobs and candidates.
• However, this doesn’t necessarily mean an application was submitted.
• This could be from AI-based matching, recruiter assignments, or candidate-job recom-
mendations.
es
- A → Remove jobs where applications exist
• If an actual application exists for a job, remove that job from the result.
• The remaining jobs are those where a candidate is associated but hasn’t applied.
Real-Life Example:
Scenario:
A company has a job posting for a Software Engineer. They use an AI system that predicts a
match between candidates and jobs based on skills and experience.
Possible Situations:
This document provides relational algebra expressions and their equivalent SQL queries for
different database operations involving Application (A), Job (J), and Candidate (C).
Each section includes:
• Relational Algebra Expression
• SQL Query
• Example Scenario
• Tabular Representation of Example Data
1. Only in A (not in J or C)
Relational Algebra:
A − (J∪C)
Description:
Find all applications that do not have a matching job or candidate.
Example Scenario:
A job application was submitted for a job that does not exist in the system or for a candidate
who is not registered.
2. Only in J (not in A or C)
Relational Algebra:
J − (A∪C)
Description:
Find all jobs that are not associated with any application or candidate.
Example Scenario:
A company has posted job listings, but no one has applied, and no candidate is associated.
(A∩C) − J
Description:
Find all applications where a candidate exists but no corresponding job exists.
SELECT *
FROM Application A
WHERE A.Candidate_ID IN (SELECT Candidate_ID FROM Candidate)
AND A.Job_ID NOT IN (SELECT Job_ID FROM Job);
Example Scenario:
A candidate submitted an application but the job was deleted.
| Application_ID | Candidate_ID | Job_ID | Status    |
|----------------|--------------|--------|-----------|
| 102            | 201          | NULL   | Submitted |
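One caveat with the NOT IN form: when Job_ID is NULL, as in the row above, the test A.Job_ID NOT IN (...) evaluates to UNKNOWN rather than TRUE, so that row is filtered out. The sketch below uses Python's built-in sqlite3 module to compare NOT IN with NOT EXISTS. Application 102 and candidate 201 come from the table above; application 101, job 301, and the names are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Job (Job_ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Candidate (Candidate_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Application (
    Application_ID INTEGER PRIMARY KEY,
    Candidate_ID INTEGER,
    Job_ID INTEGER,          -- NULL once the job has been deleted
    Status TEXT
);
-- Illustrative rows: job 301 and application 101 are assumed sample data.
INSERT INTO Job VALUES (301, 'Data Scientist');
INSERT INTO Candidate VALUES (201, 'Alice');
INSERT INTO Application VALUES (101, 201, 301, 'Submitted');
INSERT INTO Application VALUES (102, 201, NULL, 'Submitted');
""")

# (A ∩ C) - J written with NOT IN: a NULL Job_ID never satisfies NOT IN.
not_in_rows = conn.execute("""
    SELECT Application_ID FROM Application A
    WHERE A.Candidate_ID IN (SELECT Candidate_ID FROM Candidate)
      AND A.Job_ID NOT IN (SELECT Job_ID FROM Job)
""").fetchall()

# Same intent with NOT EXISTS: the NULL Job_ID row has no matching job.
not_exists_rows = conn.execute("""
    SELECT Application_ID FROM Application A
    WHERE EXISTS (SELECT 1 FROM Candidate C
                  WHERE C.Candidate_ID = A.Candidate_ID)
      AND NOT EXISTS (SELECT 1 FROM Job J WHERE J.Job_ID = A.Job_ID)
""").fetchall()

print("NOT IN    :", not_in_rows)
print("NOT EXISTS:", not_exists_rows)
```

If deleted jobs are represented by NULL in Job_ID, the NOT EXISTS form (or an explicit OR A.Job_ID IS NULL) is the safer way to express the difference.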
Description:
Find all applications that have a corresponding job but no corresponding candidate.
SELECT *
FROM Application A
WHERE A.Job_ID IN (SELECT Job_ID FROM Job)
AND A.Candidate_ID NOT IN (SELECT Candidate_ID FROM Candidate);
Example Scenario:
A job application was created but the candidate was deleted from the system.
(J∩C) − A
Description:
Find all jobs where a candidate is linked, but no actual application has been submitted.
SELECT *
FROM Job J
WHERE J.Job_ID IN (
    SELECT A.Job_ID FROM Application A
    WHERE A.Candidate_ID IN (SELECT Candidate_ID FROM Candidate));
Example Scenario:
A candidate was recommended for a job by an AI system or recruiter but did not submit
an application.
6. Find all records existing in at least one of the three tables
Relational Algebra:
A∪J∪C
Description:
Find all records that exist in at least one of the Application, Job, or Candidate tables.
SELECT Candidate_ID, NULL AS Job_ID, NULL AS Application_ID, Name AS Entity_Name, 'Candidate' AS Entity_Type
FROM Candidate
UNION
SELECT NULL, Job_ID, NULL, Title, 'Job'
FROM Job
UNION
SELECT Candidate_ID, Job_ID, Application_ID, Application_Status, 'Application'
FROM Application;
Example Scenario:
List all records that exist in any of the three tables.
Relational Algebra:
A∩J∩C
Description:
Find all applications where a candidate has applied for a job, meaning there is a connection
between all three entities.
SELECT A.*
FROM Application A
JOIN Job J ON A.Job_ID = J.Job_ID
JOIN Candidate C ON A.Candidate_ID = C.Candidate_ID;
Example Scenario:
A valid application exists where both a job and a candidate are present.
Cross Product and Join in RDBMS
In Relational Database Management Systems (RDBMS), cross product and joins define
how tables are combined. Below is a detailed explanation of each, with examples.
Cross Product (Cartesian Product):
Definition:
A cross product (Cartesian Product) is the combination of every row from the
first table with every row from the second table. It generates a result set that
has m × n rows, where:
• m = number of rows in the first table
• n = number of rows in the second table
SQL Syntax::
Example:
Table: Employees
| emp_id | emp_name |
|--------|----------|
| 1      | Alice    |
| 2      | Bob      |
Table: Departments
| dept_id | dept_name |
|---------|-----------|
| 10      | HR        |
| 20      | IT        |
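As a minimal sketch of the m × n rule, the two tables above can be loaded into an in-memory SQLite database from Python and combined with CROSS JOIN. Using sqlite3 is an assumption made here so the example runs anywhere; the table and column names are those of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT);
CREATE TABLE Departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
INSERT INTO Employees VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Departments VALUES (10, 'HR'), (20, 'IT');
""")

# Every employee row is paired with every department row.
pairs = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM Employees e
    CROSS JOIN Departments d
    ORDER BY e.emp_id, d.dept_id
""").fetchall()

for row in pairs:
    print(row)
print(len(pairs), "rows = 2 employees x 2 departments")
```

With m = 2 and n = 2 the result has 4 rows; no join condition is involved, which is what distinguishes the cross product from the joins described next.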
1. INNER JOIN
2. OUTER JOIN
• LEFT JOIN (LEFT OUTER JOIN)
• RIGHT JOIN (RIGHT OUTER JOIN)
• FULL JOIN (FULL OUTER JOIN)
3. SELF JOIN
4. CROSS JOIN (Same as Cross Product)
INNER JOIN:
An INNER JOIN returns only the matching rows from both tables where there is a common
value.
SQL Syntax::
Example:
Table: emp_dept
| emp_id | dept_id |
|--------|---------|
| 1      | 10      |
| 2      | 20      |
Result:
OUTER JOINs:
LEFT JOIN (LEFT OUTER JOIN)
• Returns all rows from the left table and matching rows from the right table.
• If there is no match, NULL is returned for the right table.
SQL Syntax::
SELECT e.emp_id, e.emp_name, d.dept_id, d.dept_name
FROM Employees e
LEFT JOIN emp_dept ed ON e.emp_id = ed.emp_id
LEFT JOIN Departments d ON ed.dept_id = d.dept_id;
Example Result:
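| emp_id | emp_name | dept_id | dept_name |
|--------|----------|---------|-----------|
| 1      | Alice    | 10      | HR        |
| 2      | Bob      | 20      | IT        |
| 3      | Charlie  | NULL    | NULL      |
RIGHT JOIN (RIGHT OUTER JOIN)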
• Returns all rows from the right table and matching rows from the left table.
• If there is no match, NULL is returned for the left table.
FULL JOIN (FULL OUTER JOIN)
• Returns all records from both tables, with NULL where there is no match.
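Not every engine supports FULL JOIN directly, so a common workaround is to take the UNION of two LEFT JOINs, one starting from each side. The sketch below shows this with Python's sqlite3 module. Employee 3 (Charlie) follows the LEFT JOIN example result above; department 30 (Finance) is an assumed extra row added so that the right side also has an unmatched record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT);
CREATE TABLE Departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE emp_dept (emp_id INTEGER, dept_id INTEGER);
INSERT INTO Employees VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
INSERT INTO Departments VALUES (10, 'HR'), (20, 'IT'), (30, 'Finance');
INSERT INTO emp_dept VALUES (1, 10), (2, 20);
""")

# Left side preserved: every employee, with NULL where no department matches.
# Right side preserved: every department, with NULL where no employee matches.
# UNION removes the duplicated matching rows.
full_rows = set(conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM Employees e
    LEFT JOIN emp_dept ed ON e.emp_id = ed.emp_id
    LEFT JOIN Departments d ON ed.dept_id = d.dept_id
    UNION
    SELECT e.emp_name, d.dept_name
    FROM Departments d
    LEFT JOIN emp_dept ed ON d.dept_id = ed.dept_id
    LEFT JOIN Employees e ON ed.emp_id = e.emp_id
""").fetchall())

for row in sorted(full_rows, key=repr):
    print(row)
```

Matched pairs appear once, while Charlie and Finance each appear with None (SQL NULL) on the side that has no partner.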
SELF JOIN:
A SELF JOIN is when a table is joined with itself.
SQL Syntax::
SELECT e1.emp_name AS Employee, e2.emp_name AS Manager
FROM Employees e1
JOIN Employees e2 ON e1.manager_id = e2.emp_id;
Example Output:
19 JOIN Operations in AI Matching Recruitment System
This document explains the relevance of JOIN operations in the AI-Driven Recruitment
System, focusing on how they are used to manage candidate applications, AI matching, job
assignments, and recruiter interactions.
JOIN Types Overview:
• INNER JOIN: Returns records with matching values in both tables.
• LEFT JOIN (LEFT OUTER JOIN): Returns all records from the left table and matched
records from the right table; NULL if no match exists.
• RIGHT JOIN (RIGHT OUTER JOIN): Returns all records from the right table and
matched records from the left table; NULL if no match exists.
• FULL JOIN (FULL OUTER JOIN): Returns all records when there is a match in either
left or right table.
• SELF JOIN: Joins a table with itself to compare rows within the same table.
Relevance:
• To find candidates who haven’t applied for any jobs.
• To identify jobs with no assigned recruiters.
• To detect applications without any interviews scheduled.
Example 1: Find Candidates Who Have NOT Applied for Any Job
SELECT C.Name
FROM Candidate C
LEFT JOIN Application A ON C.Candidate_ID = A.Candidate_ID
WHERE A.Application_ID IS NULL;
Example 2: Find Jobs WITHOUT Assigned Recruiters
SELECT J.Title
FROM Job J
LEFT JOIN Job_Assignment JA ON J.Job_ID = JA.Job_ID
WHERE JA.Recruiter_ID IS NULL;
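The LEFT JOIN ... IS NULL pattern used in both examples is often called an anti-join. The sketch below runs Example 1 against a small in-memory SQLite database and checks that a NOT EXISTS query gives the same answer. The three candidates and their applications are assumed sample data, not rows from the course dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Candidate (Candidate_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Application (
    Application_ID INTEGER PRIMARY KEY,
    Candidate_ID INTEGER,
    Job_ID INTEGER
);
-- Assumed sample rows: Bob has no application.
INSERT INTO Candidate VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
INSERT INTO Application VALUES (100, 1, 500), (101, 1, 501), (102, 3, 500);
""")

# Anti-join: keep candidates whose LEFT JOIN found no application row.
left_join_result = conn.execute("""
    SELECT C.Name
    FROM Candidate C
    LEFT JOIN Application A ON C.Candidate_ID = A.Candidate_ID
    WHERE A.Application_ID IS NULL
""").fetchall()

# Equivalent formulation with a correlated subquery.
not_exists_result = conn.execute("""
    SELECT C.Name
    FROM Candidate C
    WHERE NOT EXISTS (
        SELECT 1 FROM Application A WHERE A.Candidate_ID = C.Candidate_ID
    )
""").fetchall()

print(left_join_result, not_exists_result)
```

Testing a non-nullable column of the right table (here the primary key Application_ID) for NULL is what makes the filter reliable.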
Relevance:
• To find recruiters who are not assigned to any jobs.
• To identify unmatched AI records (e.g., jobs without matched candidates).
Example 1: Find Recruiters WITHOUT Any Job Assignments
SELECT R.Name
FROM Job_Assignment JA
RIGHT JOIN Recruiter R ON JA.Recruiter_ID = R.Recruiter_ID
WHERE JA.Job_ID IS NULL;
Definition: Returns all records when there is a match in either left or right table. Records
without matches will have NULL values.
Note: MySQL doesn’t support FULL JOIN directly. Use UNION with LEFT JOIN and RIGHT
JOIN.
Relevance:
• To compare all candidates and applications, including those without any connections.
• To find jobs and recruiters, including those with or without job assignments.
Example 1: List All Candidates and Their Applications (Even If No Match Exists)
SELECT C.Name, A.Application_ID
FROM Candidate C
LEFT JOIN Application A ON C.Candidate_ID = A.Candidate_ID
UNION
SELECT C.Name, A.Application_ID
FROM Candidate C
RIGHT JOIN Application A ON C.Candidate_ID = A.Candidate_ID;
SELF JOIN:
Definition: Joins a table with itself to compare rows within the same table.
Relevance:
• To find candidates with similar skill sets.
• To compare jobs with overlapping skills or salary ranges.
Example 1: Find Candidates with Matching Skills
SELECT A.Name AS Candidate1, B.Name AS Candidate2
FROM Candidate A
INNER JOIN Candidate B ON A.Skills = B.Skills
WHERE A.Candidate_ID <> B.Candidate_ID;
CROSS JOIN:
Definition: Returns the Cartesian product of both tables.
Relevance:
• To generate all possible combinations of candidates and jobs (e.g., for AI Matching
algorithms).
• To simulate scenarios where all candidates are matched with all jobs before applying
filters.
Example 1: Generate All Candidate-Job Combinations (For AI Matching)
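| JOIN Type  | Purpose                                                            | Example Use Case                                        |
|------------|--------------------------------------------------------------------|---------------------------------------------------------|
| FULL JOIN  | Get all records from both tables, with NULLs where no match exists | All candidates and applications (with or without match) |
| SELF JOIN  | Compare records within the same table                              | Candidates with similar skills                          |
| CROSS JOIN | Cartesian product of both tables                                   | All candidate-job combinations for AI Matching          |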
• INNER JOIN and LEFT JOIN are the most commonly used in the recruitment system.
• CROSS JOIN is valuable when designing AI Matching models.
• SELF JOIN helps identify internal patterns (e.g., similar candidates or jobs).
20 AI Matching Recruitment System with Conventional DBMS Models
The AI Matching Recruitment System can be represented using various conventional
DBMS models. Each model describes how data is organized, related, and managed in the
recruitment workflow.
• Hierarchical Model
• Network Model
• Relational Model (RDBMS)
• Object-Oriented Model (OODBMS)
• Entity-Relationship Model (ER Model)
• Document-Oriented Model (NoSQL)
• Key-Value Model (NoSQL)
Representation:
Company
└── Recruiters
└── Jobs
└── Applications
└── Candidates
└── AI_Matching
└── Interviews
Key Points:
• Each Recruiter manages multiple Jobs.
• Applications are linked to Candidates.
• AI_Matching connects candidates to jobs.
Structure:
Organizes data as a graph allowing many-to-many relationships.
Representation:
Matches Assigned
↓ ↓
(AI_Matching) <--- Managed By ---> (Recruiter)
↑
Interviewed In
↓
(Interview)
Advantages:
• Handles complex M:N relationships efficiently.
• Flexible data connections.
Disadvantages:
• Complex navigation paths.
• Hard to manage schema changes.
• Performance issues with large-scale complex joins.
Object-Oriented Model (OODBMS):
Structure:
Represents data as objects, similar to OOP languages.
Example:
class Candidate:
    """A job seeker with a skill list and years of experience."""
    def __init__(self, candidate_id, name, skills, experience):
        self.candidate_id = candidate_id
        self.name = name
        self.skills = skills
        self.experience = experience

class Job:
    """A job posting with required skills and an offered salary."""
    def __init__(self, job_id, title, skills, salary):
        self.job_id = job_id
        self.title = title
        self.skills = skills
        self.salary = salary

class Application:
    """Links a Candidate object to a Job object (object references, not IDs)."""
    def __init__(self, app_id, candidate, job, status):
        self.app_id = app_id
        self.candidate = candidate
        self.job = job
        self.status = status
Advantages:
• Handles complex data structures naturally.
• Seamless integration with OOP languages.
Disadvantages:
Advantages:
• Excellent for conceptual database design.
• Clear visualization of data relationships.
Disadvantages:
• Not directly implemented; requires conversion to RDBMS.
Document-Oriented Model (NoSQL):
Structure:
Stores data as documents (JSON, BSON) for flexibility.
Example (MongoDB):
{
  "Candidate_ID": "C1",
  "Name": "Alice",
  "Skills": ["Python", "Machine Learning"],
  "Applications": [
    {
      "Job_ID": "J1",
      "Status": "Applied",
      "AI_Matching": {
        "Match_Score": 85
      },
      "Interviews": [
        {"Date": "2024-05-01", "Outcome": "Pass"}
      ]
    }
  ]
}
Advantages:
• High scalability and flexibility.
• Schema-less design supports dynamic data.
Disadvantages:
• Complex querying compared to SQL.
• Limited transactional support.
Key-Value Model (NoSQL):
Structure:
Stores data as key-value pairs, suitable for fast lookups.
Example (Redis):
Advantages:
• Extremely fast for simple key-based queries.
• Scalable for real-time applications.
Disadvantages:
• Limited support for complex queries.
• No relational capabilities.
Graph Model (NoSQL):
Structure:
Represents data as nodes (entities) and edges (relationships).
Example (Neo4j):
(Alice)-[:APPLIED_FOR]->(Job:Data_Scientist)
(Alice)-[:MATCHED_WITH {score: 85}]->(Job:Data_Scientist)
(Recruiter:John)-[:MANAGES]->(Job:Data_Scientist)
Advantages:
• Efficient for complex relationship queries.
• Fast graph traversals (e.g., recommendations).
Disadvantages:
• Requires specialized query languages (e.g., Cypher).
• Overhead for simple data models.
Comparison of DBMS Models:
| Model                     | Best Suited For                     | Strength                             | Limitation                         |
|---------------------------|-------------------------------------|--------------------------------------|------------------------------------|
| Hierarchical Model        | Simple parent-child workflows       | Fast retrieval for hierarchical data | Rigid structure                    |
| Network Model             | Complex many-to-many relationships  | Efficient M:N handling               | Complex navigation                 |
| Relational Model (RDBMS)  | Structured data, SQL queries        | Strong data integrity (ACID)         | Performance issues with large joins |
| Object-Oriented Model     | Complex data, OOP integration       | Natural fit with OOP languages       | Complex queries                    |
| Entity-Relationship Model | Conceptual database design          | Clear visualization                  | Requires conversion to RDBMS       |
• Relational Models (RDBMS) are ideal for structured data and complex queries.
• Graph Models excel in relationship-heavy applications (e.g., AI Matching).
• NoSQL Models like Document-Oriented and Key-Value are great for dynamic, real-
time data.
• The choice of the model depends on the specific needs of the recruitment system.
21 Server Hierarchy
Server
└── Database
    ├── Schema
    │   ├── Tables
    │   │   ├── Columns
    │   │   ├── Rows
    │   │   └── Keys
    │   │       ├── Primary
    │   │       └── Foreign
    │   ├── Views
    │   ├── Indexes
    │   ├── Stored Procedures
    │   └── Triggers
22 Functional Dependency in RDBMS
What is a Functional Dependency?
A Functional Dependency (FD) is a relationship between two sets of attributes in a relation
(table) of a relational database.
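Formally, if we say:
X → Y
then Y is functionally dependent on X: any two tuples that agree on the attributes in X
must also agree on the attributes in Y.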
22.1 Example
X → Y
is trivial.
Example:
2. Non-Trivial Functional Dependency
If Y is not a subset of X.
Example:
StudentID → Name
3. Full Functional Dependency
Y is fully dependent on X and not on any subset of X.
Example:
StudentID → Email
4. Partial Dependency
Y depends on part of a composite key X.
Example: If (CourseID, StudentID) is the primary key, and:
StudentID → StudentName
If:
A → B and B → C
Then:
A → C
Example:
⇒ StudentID → DepartmentName
2. Determining Keys
• Help define candidate keys, primary keys, and super keys
3. Data Integrity
• Ensure consistency and correctness of data
Armstrong’s Axioms
Used to infer all possible functional dependencies.
| Rule         | Description                        |
|--------------|------------------------------------|
| Reflexivity  | If Y ⊆ X, then X → Y               |
| Augmentation | If X → Y, then XZ → YZ             |
| Transitivity | If X → Y and Y → Z, then X → Z     |
Extended rules:
• Union: If X → Y and X → Z, then X → YZ
• Decomposition: If X → YZ, then X → Y and X → Z
• Pseudotransitivity: If X → Y and YZ → W, then XZ → W
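In practice, the axioms are rarely applied one by one. Instead, the closure X⁺ of an attribute set X (every attribute that X determines) is computed, and X → Y holds exactly when Y ⊆ X⁺. The following is a minimal sketch of that standard closure algorithm. The helper names closure and implies, and the attribute DeptID, are illustrative assumptions.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know every attribute on the left-hand side,
            # the right-hand side follows (this covers transitivity and union).
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(x, y, fds):
    """True if the dependency x -> y follows from fds."""
    return set(y) <= closure(x, fds)


# Illustrative dependencies; DeptID is an assumed intermediate attribute.
fds = [
    ({"StudentID"}, {"Name"}),
    ({"StudentID"}, {"Email"}),
    ({"StudentID"}, {"DeptID"}),
    ({"DeptID"}, {"DepartmentName"}),
]

student_closure = closure({"StudentID"}, fds)
print(sorted(student_closure))
print(implies({"StudentID"}, {"DepartmentName"}, fds))  # follows by transitivity
print(implies({"DeptID"}, {"Name"}, fds))               # does not follow
```

The same closure function can be used to test whether a set of attributes is a superkey: it is one when its closure contains every attribute of the relation.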
22.2 Practical Use Case
StudentID → Email
StudentID → Name
Then:
• Avoid storing Email or Name redundantly in other tables.
• Join on StudentID when necessary.
Functional Dependencies and Normal Forms
Functional dependencies play a key role in identifying redundancy and guiding table decom-
position during normalization in relational database design. Here’s how they relate to normal
forms: 1NF, 2NF, 3NF, and BCNF.
Rule:
• Every attribute must contain atomic (indivisible) values.
• No repeating groups or arrays.
Functional Dependency Role:
• Functional dependencies are not deeply involved yet, but 1NF is the necessary starting
point.
Example:
-- Violates 1NF:
CREATE TABLE Students (
    StudentID INT,
    Name VARCHAR(100),
    Courses VARCHAR(255) -- e.g., "Math, Physics"
);
Fix:
Rule:
• Must be in 1NF.
• No partial dependencies: every non-prime attribute must be fully functionally depen-
dent on the entire primary key.
Functional Dependency Role:
• Detect and eliminate partial functional dependencies.
Example:
Functional dependencies:
• StudentID → StudentName
• CourseID → CourseName
Violates 2NF because StudentName depends only on StudentID.
Fix:
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(100)
);
Rule:
• Must be in 2NF.
• No transitive dependencies: non-key attributes should not depend on other non-key
attributes.
Functional Dependency Role:
EmpName VARCHAR(100),
DeptID INT,
DeptName VARCHAR(100)
);
Functional dependencies:
• EmpID → DeptID
• DeptID → DeptName
• Therefore: EmpID → DeptName (transitive)
Fix:
);
| Normal Form | FD Issue Eliminated   | Requirement                                                 |
|-------------|-----------------------|-------------------------------------------------------------|
| 1NF         | Non-atomic values     | Atomic values only                                          |
| 2NF         | Partial dependency    | Full dependency on the entire primary key                   |
| 3NF         | Transitive dependency | No non-key attributes depending on other non-key attributes |
24 Normalization in Relational Databases
Normalization is the process of organizing data in a relational database to:
• Reduce redundancy
• Prevent update, insertion, and deletion anomalies
• Ensure data integrity
It involves breaking down large tables into smaller, related tables and connecting them via
foreign keys.
Without normalization, databases often suffer from:
• Insertion anomalies – A fact cannot be inserted without also supplying unrelated data.
• Update anomalies – Data is updated in one place but stale copies remain elsewhere.
• Deletion anomalies – Deleting a record may remove important data unintentionally.
Example:
-- Before Normalization:
| StudentID | Name | Course1 | Course2 |
|-----------|--------|---------|---------|
| 101 | Alice | Math | Physics |
Issues:
• Courses are stored in multiple columns (repeating group).
• Hard to scale, search, or maintain.
After Normalization (1NF):
| StudentID | Name   | Course  |
|-----------|--------|---------|
| 101       | Alice  | Math    |
| 101       | Alice  | Physics |
Each normal form (NF) builds upon the previous one to further reduce redundancy.
Goal: Eliminate repeating groups and ensure atomicity.
• All values in columns must be atomic (indivisible).
• No arrays, lists, or nested tables.
Bad:
Good:
| 101 | Alice | Physics |
• Must be in 1NF.
• Every non-prime attribute must be fully functionally dependent on the whole primary
key.
Bad:
Here:
• EmpID → DeptID
• DeptID → DeptName
• Therefore: EmpID → DeptName (transitive)
Fix:
Functional dependency:
RoomNumber INT,
PRIMARY KEY (CourseID, RoomNumber)
);
• Minimizes redundancy
• Improves data consistency
• Simplifies maintenance
• Enhances data integrity
• Saves storage space
• Denormalization may be preferred for performance (e.g., analytics, reporting).
• Useful when minimizing joins and optimizing read-heavy queries.
25 SQL-Based Normalization Walkthrough
Scenario: Student Enrollment System
We begin with a denormalized table containing student and course data.
Unnormalized Table
CREATE TABLE StudentEnrollment (
    StudentID INT,
    StudentName VARCHAR(100),
    Course1 VARCHAR(100),
    Course2 VARCHAR(100),
    Instructor1 VARCHAR(100),
    Instructor2 VARCHAR(100)
);
Sample Data:
Course VARCHAR(100),
Instructor VARCHAR(100)
);
Transformed Data:
StudentName VARCHAR(100)
);
CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    Instructor VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID)
);
Course VARCHAR(100),
PRIMARY KEY (StudentID, Course),
FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
FOREIGN KEY (Course) REFERENCES Courses(Course)
);
-- Students table
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    StudentName VARCHAR(100)
);
-- Courses table
CREATE TABLE Courses (
    Course VARCHAR(100) PRIMARY KEY,
    Instructor VARCHAR(100)
);
-- Enrollment table
CREATE TABLE StudentCourses (
Instructor VARCHAR(100) PRIMARY KEY,
Course VARCHAR(100)
);
-- Course-Location assignment (optional BCNF step)
CREATE TABLE CourseAssignments (
    Course VARCHAR(100),
    Room VARCHAR(50),
    PRIMARY KEY (Course, Room),
    FOREIGN KEY (Course) REFERENCES Courses(Course)
);
26 SQL Inserts, Joins, and Updates Without Anomalies
Insert Sample Data
Students
StudentCourses
SELECT
    s.StudentID,
    s.StudentName,
    sc.Course,
    c.Instructor
FROM
    Students s
JOIN StudentCourses sc ON s.StudentID = sc.StudentID
JOIN Courses c ON sc.Course = c.Course;
Sample Output:
To update instructor info, we’d need to find and update every record manually.
Now:
Update once in the Courses table.
UPDATE Courses
SET Instructor = 'Dr. Albert Smith'
WHERE Course = 'Math';
SELECT
    s.StudentID,
    s.StudentName,
    sc.Course,
    c.Instructor
FROM
    Students s
JOIN StudentCourses sc ON s.StudentID = sc.StudentID
JOIN Courses c ON sc.Course = c.Course;
Updated Output:
| StudentID | StudentName | Course    | Instructor       |
|-----------|-------------|-----------|------------------|
| 101 | Alice | Math | Dr. Albert Smith |
| 101 | Alice | Physics | Dr. Brown |
| 102 | Bob | Chemistry | Dr. Green |
| 103 | Charlie | Math | Dr. Albert Smith |
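The whole update-then-verify sequence can be reproduced end to end with Python's sqlite3 module, as sketched below. The schema and the final output match the tables above; the original instructor name 'Dr. Smith' is an assumption, since the sample insert statements are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentID INT PRIMARY KEY, StudentName VARCHAR(100));
CREATE TABLE Courses (Course VARCHAR(100) PRIMARY KEY, Instructor VARCHAR(100));
CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (Course) REFERENCES Courses(Course)
);
INSERT INTO Students VALUES (101, 'Alice'), (102, 'Bob'), (103, 'Charlie');
-- 'Dr. Smith' is an assumed starting value for the Math instructor.
INSERT INTO Courses VALUES
    ('Math', 'Dr. Smith'), ('Physics', 'Dr. Brown'), ('Chemistry', 'Dr. Green');
INSERT INTO StudentCourses VALUES
    (101, 'Math'), (101, 'Physics'), (102, 'Chemistry'), (103, 'Math');
""")

# The instructor name is stored once, so a single row is updated.
cur = conn.execute(
    "UPDATE Courses SET Instructor = ? WHERE Course = ?",
    ("Dr. Albert Smith", "Math"),
)
rows_touched = cur.rowcount

# Every enrollment in Math picks up the new name through the join.
report = conn.execute("""
    SELECT s.StudentID, s.StudentName, sc.Course, c.Instructor
    FROM Students s
    JOIN StudentCourses sc ON s.StudentID = sc.StudentID
    JOIN Courses c ON sc.Course = c.Course
    ORDER BY s.StudentID, sc.Course
""").fetchall()

print("rows updated:", rows_touched)
for row in report:
    print(row)
```

Only one row is written even though two students are enrolled in Math, which is exactly the update anomaly that the normalized design avoids.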
27 SQL DELETE and Cascading Relationships
Cascading Deletes with ON DELETE CASCADE
In a normalized database, if we delete a record (e.g., a student or a course), we often want
related data to be deleted automatically to avoid orphaned records.
For example, if a student is removed, all their enrollments should be removed too. Similarly,
deleting a course should remove any course assignments.
Step 1: Define Foreign Key with ON DELETE CASCADE
Here’s how to define cascading deletes when setting up the foreign key relationship:
Course VARCHAR(100),
PRIMARY KEY (StudentID, Course),
FOREIGN KEY (StudentID) REFERENCES Students(StudentID) ON DELETE CASCADE,
FOREIGN KEY (Course) REFERENCES Courses(Course) ON DELETE CASCADE
);
Result:
The student with ID 101 (Alice) and all records associated with her in the StudentCourses
table will be automatically deleted.
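A minimal sketch of this behaviour, using Python's sqlite3 module, is shown below. Two points are assumptions specific to SQLite rather than part of the walkthrough: foreign key enforcement must be switched on per connection with PRAGMA foreign_keys = ON, and the sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Students (StudentID INT PRIMARY KEY, StudentName VARCHAR(100));
CREATE TABLE Courses (Course VARCHAR(100) PRIMARY KEY, Instructor VARCHAR(100));
CREATE TABLE StudentCourses (
    StudentID INT,
    Course VARCHAR(100),
    PRIMARY KEY (StudentID, Course),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID) ON DELETE CASCADE,
    FOREIGN KEY (Course) REFERENCES Courses(Course) ON DELETE CASCADE
);
INSERT INTO Students VALUES (101, 'Alice'), (102, 'Bob'), (103, 'Charlie');
INSERT INTO Courses VALUES
    ('Math', 'Dr. Smith'), ('Physics', 'Dr. Brown'), ('Chemistry', 'Dr. Green');
INSERT INTO StudentCourses VALUES
    (101, 'Math'), (101, 'Physics'), (102, 'Chemistry'), (103, 'Math');
""")

# Deleting the parent row removes Alice's enrollments automatically.
conn.execute("DELETE FROM Students WHERE StudentID = 101")
after_student_delete = conn.execute(
    "SELECT StudentID, Course FROM StudentCourses ORDER BY StudentID"
).fetchall()

# Deleting a course cascades the same way through the second foreign key.
conn.execute("DELETE FROM Courses WHERE Course = 'Math'")
after_course_delete = conn.execute(
    "SELECT StudentID, Course FROM StudentCourses ORDER BY StudentID"
).fetchall()

print(after_student_delete)
print(after_course_delete)
```

Without the pragma, SQLite would leave the orphaned enrollment rows in place, so the setting is worth checking whenever cascades appear not to fire.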
Step 3: Delete a Course and Automatically Remove All Enrollments and Assignments
Similarly, let’s delete a course and have its assignments and student enrollments removed
automatically.
DELETE FROM Courses
WHERE Course = 'Math';
Result:
All records related to the course “Math” in both the StudentCourses and CourseAssignments
tables will be deleted.
Additional Notes on CASCADE
When you define ON DELETE CASCADE, be cautious as it can lead to unintentional data loss
if the delete operation is not carefully controlled. Always ensure that cascading deletes are
applied only to tables where it makes sense (e.g., child records related to a parent record).
You can also apply ON UPDATE CASCADE to propagate changes in primary key values across
related tables, but it’s less common than cascading deletes.
Testing Cascading Deletes
You can simulate cascading deletes by running the following tests:
1. Insert new data:
28 Lossless and Lossy Joins in RDBMS
Definition of Join
A join is an operation to combine two or more relations based on a common attribute.
Lossless Join
A lossless join ensures that decomposing and then rejoining relations does not result
in any loss or addition of data.
• Maintains original data integrity
• No spurious tuples created
• The common attribute(s) must form a key of at least one of the decomposed relations
Example:
-- Original
Student(id, name, dept)
-- Decomposed
Student1(id, name)
Student2(id, dept)
-- Rejoined
A lossy join occurs when rejoining decomposed tables introduces incorrect or extra rows.
• Does not preserve original relation exactly
• May occur when common attribute is not a key
Example:
-- Original
R(A, B, C)
-- Rejoin
SELECT * FROM R1
JOIN R2 USING(B);
Lossless Join Test
A decomposition of R into R1 and R2 is lossless if:
(R1 ∩ R2) → R1 or (R1 ∩ R2) → R2
Table 1: Comparison Table
| Join Type | Description                                 |
|-----------|---------------------------------------------|
| Lossless  | No spurious tuples, maintains original data |
| Lossy     | Extra tuples added, incorrect reconstruction |
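The binary test above can be checked mechanically: compute the closure of the common attributes and see whether it covers R1 or R2. The sketch below is one way to do this in Python. The function names are illustrative, and the dependencies id → name and id → dept are an assumption consistent with id being the key of Student.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Apply the test (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined


# Student(id, name, dept) split on the key id: lossless.
student_fds = [({"id"}, {"name"}), ({"id"}, {"dept"})]
student_ok = is_lossless_binary({"id", "name"}, {"id", "dept"}, student_fds)

# R(A, B, C) split on B when B determines nothing: lossy.
lossy_case = is_lossless_binary({"A", "B"}, {"B", "C"}, [({"A"}, {"B"})])

# The same split becomes lossless once B → C is known to hold.
fixed_case = is_lossless_binary({"A", "B"}, {"B", "C"}, [({"B"}, {"C"})])

print(student_ok, lossy_case, fixed_case)
```

This test applies only to decompositions into two relations; for three or more, the tabular method is needed.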
29 Checking for Lossless vs Lossy Join in RDBMS
Lossless Join Detection Algorithm:
This algorithm checks whether a decomposition of a relation is lossless or lossy using a
tabular method and functional dependencies.
Inputs:
• Relation R(A1, A2, …, An)
• Decomposition: R1, R2, …, Rk
• Functional dependencies (FDs)
Algorithm Steps
1. Create a table:
• Rows = decomposed relations (R1, R2, …, Rk)
• Columns = attributes of R
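2. Initialize values:
• If attribute Aj is in Ri → mark the cell (Ri, Aj) as aj
• Else → mark it as a unique symbol bij
3. Apply the FDs repeatedly: for X → Y, whenever two rows agree on every attribute of X,
make their Y symbols equal (prefer the a symbol if either row has it).
4. If any row consists only of a symbols, the decomposition is lossless; otherwise it is lossy.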
Relation: R(A, B, C)
Decomposition: R1(A, B), R2(B, C)
FDs: A → B, C → A
Initial Table:
| Row | A | B | C |
|-----|-----|-----|-----|
| R1 | a1 | a2 | b3 |
| R2 | b1 | a2 | a3 |
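Apply FDs:
- A → B: rows R1 and R2 do not agree on A (a1 vs b1), so nothing changes
- C → A: rows R1 and R2 do not agree on C (b3 vs a3), so nothing changes
- No row is fully 'aᵢ' → Lossy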
Remarks:
A decomposition is lossless if and only if we can propagate values such that at least one row
contains only original aᵢ values.
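The tabular method can be sketched in a few lines of Python. The function name chase_is_lossless and the tuple encoding of the symbols are illustrative choices, not part of the algorithm's standard statement; the example call reuses R(A, B, C), R1(A, B), R2(B, C) with A → B and C → A.

```python
def chase_is_lossless(attributes, decomposition, fds):
    """Tabular (chase) test for a lossless-join decomposition.

    attributes:    list of attribute names of R
    decomposition: list of attribute sets, one per sub-relation
    fds:           list of (lhs, rhs) pairs of attribute sets
    """
    # ("a", attr) is the distinguished symbol; ("b", i, attr) is unique per cell.
    table = [
        {attr: ("a", attr) if attr in rel else ("b", i, attr) for attr in attributes}
        for i, rel in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for row1 in table:
                for row2 in table:
                    # The FD only fires when two distinct rows agree on all of lhs.
                    if row1 is row2 or any(row1[x] != row2[x] for x in lhs):
                        continue
                    for attr in rhs:
                        s1, s2 = row1[attr], row2[attr]
                        if s1 == s2:
                            continue
                        # Equate the symbols, keeping the "a" symbol if present.
                        keep, drop = (s1, s2) if s1[0] == "a" else (s2, s1)
                        for row in table:
                            if row[attr] == drop:
                                row[attr] = keep
                        changed = True
    # Lossless iff some row now holds only distinguished symbols.
    return any(all(row[a] == ("a", a) for a in attributes) for row in table)


attrs = ["A", "B", "C"]
parts = [{"A", "B"}, {"B", "C"}]

example_result = chase_is_lossless(attrs, parts, [({"A"}, {"B"}), ({"C"}, {"A"})])
with_b_to_c = chase_is_lossless(attrs, parts, [({"B"}, {"C"})])

print("A->B, C->A :", "lossless" if example_result else "lossy")
print("B->C       :", "lossless" if with_b_to_c else "lossy")
```

With A → B and C → A no two rows ever agree on a left-hand side, so the table never changes and the decomposition is reported as lossy. Adding B → C lets the shared B value equate the C column, and the first row becomes all a symbols.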
When decomposing a relation into two or more sub-relations, it’s important to ensure that
the original relation can be perfectly reconstructed using a natural join. This is known as a
lossless join. If reconstruction introduces spurious tuples (rows that were not in the original
relation), the decomposition is lossy.
Lossless Join Condition
For a decomposition of a relation R into R1 and R2, the join is lossless if:
(R1 ∩ R2) → R1 or
(R1 ∩ R2) → R2
This means the common attribute(s) must functionally determine at least one of the
relations.
Illustrated Example
• Original Relation: R(A, B, C)
• Decomposed Relations:
– R1(A, B)
– R2(B, C)
• Join operation: R1 ⋈ R2
Outcome:
• The join is lossless if B → A or B → C holds (i.e., B is a key in either relation).
• The join is lossy if B does not functionally determine the other attributes.
Use Case in AI-Driven Recruitment
Suppose you have a relation:
• job_id is a key in C2
Otherwise, when joining on these attributes, you may produce spurious combinations.
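Table 1: Types of join
| Join Type         | Description                                                  |
|-------------------|--------------------------------------------------------------|
| Lossless Join     | Reconstructs original data without spurious tuples           |
| Lossy Join        | Produces spurious tuples that were not in the original data  |
| Lossless Criteria | Common attributes must functionally determine one relation   |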
30 Stored Procedures in Relational Databases
A stored procedure is a precompiled collection of one or more SQL statements that can be
executed as a single unit. These procedures are stored within the database itself and can be
invoked from an application or another SQL command. Stored procedures are typically used
to encapsulate logic that can be reused and to enhance performance by reducing the amount
of code sent over the network.
Benefits of Stored Procedures
1. Performance: Stored procedures are precompiled, which means that the database en-
gine optimizes them ahead of time, reducing the processing time for execution. They
reduce the network overhead since only the procedure call is sent over the network
rather than sending multiple SQL queries.
2. Security: Stored procedures allow you to encapsulate complex logic and access control.
Permissions can be granted to users on the procedure rather than on the underlying
tables, providing an extra layer of security.
3. Maintainability: Once a stored procedure is created, it can be used multiple times in
different applications. This reduces redundancy and makes maintenance easier, as you
only need to update the logic in one place.
4. Transaction Control: Stored procedures can contain logic to handle transactions (e.g.,
using BEGIN TRANSACTION, COMMIT, and ROLLBACK). This ensures that all operations
within the procedure are completed successfully, or no changes are made if an
error occurs.
5. Reduced Complexity: Complex operations that require multiple steps can be encapsu-
lated in a stored procedure, allowing applications to call a single procedure rather than
multiple SQL statements.
DELIMITER $$
END $$
DELIMITER ;
In this example:
• DELIMITER $$: Changes the delimiter so that the semicolon (;) can be used inside the
procedure body.
• IN student_id INT: Defines an input parameter called student_id.
• The body of the procedure contains a SELECT statement to retrieve student information.
PostgreSQL (Example)
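CREATE OR REPLACE FUNCTION GetStudentInfo(student_id INT)
RETURNS TABLE(name VARCHAR, age INT, course VARCHAR) AS $$
BEGIN
    RETURN QUERY
    -- Qualify the columns with the table alias: the output columns declared in
    -- RETURNS TABLE share their names and would otherwise be ambiguous.
    SELECT s.name, s.age, s.course
    FROM students s
    WHERE s.id = student_id;
END;
$$ LANGUAGE plpgsql;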
In PostgreSQL, functions are used similarly to stored procedures. Here, RETURN QUERY is
AS
BEGIN
SELECT name, age, course
FROM students
WHERE id = @student_id;
END;
In SQL Server, the parameters are defined with a @ symbol, and BEGIN...END is used to define
the block of code.
Types of Stored Procedures:
1. Simple Stored Procedures: These procedures are used to perform simple operations
such as inserts, updates, and deletes.
2. Parameterized Stored Procedures: These allow for input parameters to be passed
to the stored procedure. These parameters can be used in the SQL logic inside the
procedure.
3. Returning Results: Some stored procedures return a result set (similar to a query) to
the calling application or user.
4. Transactional Stored Procedures: These procedures contain BEGIN TRANSACTION,
COMMIT, and ROLLBACK statements to manage transactions. They ensure atomicity and
integrity of the operations.
Example Use Cases:
1. Data Validation: A stored procedure can be used to validate incoming data before it is
inserted into a table. For example, checking if a customer’s age is above a certain value
or if an email address follows a valid format.
2. Complex Business Logic: A stored procedure can encapsulate complex business logic
that involves multiple operations, such as calculating discounts, applying taxes, and
updating inventory when an order is placed.
3. Batch Processing: You can use stored procedures to perform batch processing, such
as updating records in bulk or aggregating data from multiple tables.
4. Error Handling: Procedures can be used to handle errors systematically. For instance,
you can log errors into a separate error table whenever something goes wrong inside
the procedure.
5. Security Auditing: Procedures can be written to track who accessed sensitive data
or performed certain operations, which is especially useful in environments requiring
compliance with standards like HIPAA or GDPR.
Example of Complex Stored Procedure (with Error Handling)
DELIMITER $$
CREATE PROCEDURE ProcessOrder(IN order_id INT, IN customer_id INT)
BEGIN
    DECLARE total_amount DECIMAL(10,2);
    DECLARE product_count INT;
    DECLARE error_message VARCHAR(255);
    -- Start Transaction
    START TRANSACTION;
-- Commit Transaction
COMMIT;
END $$
DELIMITER ;
In this example:
• It starts a transaction using START TRANSACTION.
• It checks if the customer exists. If not, it rolls back the transaction and raises an error
using SIGNAL.
• It checks if the total order amount exceeds a threshold. If it does, it rolls back the
transaction and raises an error.
• If everything is successful, it updates the order status and commits the transaction.
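The same validate, then commit-or-roll-back flow can be sketched on the application side in Python. This is an analogue only: SQLite, used here through the sqlite3 module, has no stored procedures, and the table layout, the function name process_order, and the threshold of 10,000 are all assumptions made for illustration.

```python
import sqlite3

MAX_ORDER_TOTAL = 10_000  # assumed threshold; the actual value is not given


def process_order(conn, order_id, customer_id):
    """Validate an order and mark it processed inside one transaction."""
    # "with conn" commits on normal exit and rolls back if an exception escapes,
    # mirroring COMMIT / ROLLBACK + SIGNAL in the procedure.
    with conn:
        customer = conn.execute(
            "SELECT 1 FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if customer is None:
            raise ValueError("customer does not exist")
        order = conn.execute(
            "SELECT total FROM orders WHERE id = ? AND customer_id = ?",
            (order_id, customer_id),
        ).fetchone()
        if order is None:
            raise ValueError("order not found for this customer")
        if order[0] > MAX_ORDER_TOTAL:
            raise ValueError("order total exceeds the allowed limit")
        conn.execute(
            "UPDATE orders SET status = 'Processed' WHERE id = ?", (order_id,)
        )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, status TEXT
);
INSERT INTO customers VALUES (1);
INSERT INTO orders VALUES (1, 1, 500.0, 'Pending'), (2, 1, 50000.0, 'Pending');
""")

outcomes = {}
for order_id, customer_id in [(1, 1), (2, 1), (1, 99)]:
    try:
        process_order(conn, order_id, customer_id)
        outcomes[(order_id, customer_id)] = "committed"
    except ValueError as exc:
        outcomes[(order_id, customer_id)] = f"rolled back: {exc}"

statuses = dict(conn.execute("SELECT id, status FROM orders"))
print(outcomes)
print(statuses)
```

Parameters are passed with ? placeholders rather than string concatenation, which is the application-side counterpart of using procedure parameters.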
Error Handling in Stored Procedures
Different databases offer various methods for handling errors within stored procedures:
• MySQL: You can use SIGNAL SQLSTATE to raise an error.
• PostgreSQL: You can use EXCEPTION blocks to catch and handle errors.
• SQL Server: You can use TRY...CATCH blocks to handle errors.
Best Practices for Stored Procedures
1. Keep Logic Simple: The more complex your stored procedure becomes, the harder it
is to maintain. Try to keep procedures simple and focused on a single task.
2. Use Transaction Management: Ensure that your stored procedure uses proper trans-
action handling (e.g., BEGIN TRANSACTION, COMMIT, ROLLBACK) to guarantee atomicity.
3. Parameterize Queries: Avoid concatenating user input directly into SQL queries to
ur
6. Error Logging: Make use of error logging within the stored procedure for easier de-
bugging and to maintain records of any issues.
Stored procedures are a powerful feature in relational databases, offering benefits such as
performance optimization, security, and maintainability. They allow you to encapsulate com-
plex logic within the database, thus reducing the need for repetitive code in applications.
D
However, they should be used wisely, as they can complicate your database schema and
maintenance if not implemented properly.
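The advice to parameterize queries can be illustrated from the application side. The sketch below is not MySQL stored-procedure code: it assumes Python's built-in sqlite3 module, and the students table, its rows, and the find_student helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)", [(101, "Asha"), (102, "Ravi")])

def find_student(conn, name):
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes inside `name` cannot change the structure of the query.
    return conn.execute(
        "SELECT id, name FROM students WHERE name = ?", (name,)).fetchall()

hostile = "x' OR '1'='1"
safe_rows = find_student(conn, hostile)

# For comparison only: concatenation lets the input rewrite the WHERE clause.
unsafe_rows = conn.execute(
    "SELECT id, name FROM students WHERE name = '" + hostile + "'").fetchall()
```

With the placeholder, the hostile string is treated as an ordinary name and matches nothing; with concatenation, the same string turns the condition into one that matches every row.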
31 MySQL Stored Procedure Examples
This section provides examples of common stored procedures in MySQL using different
parameter types and control structures.
1. Basic Stored Procedure (No Parameters)
DELIMITER //
DELIMITER ;
Call:
CALL GetAllStudents();
DELIMITER //
DELIMITER ;
Call:
CALL GetStudentById(101);
DELIMITER //
DELIMITER ;
Call:
CALL GetStudentName(101, @name);
SELECT @name;
4. Stored Procedure with INOUT Parameters
DELIMITER //
DELIMITER ;
Call:
DELIMITER //
END //
DELIMITER ;
Call:
DELIMITER //
CREATE PROCEDURE PrintNumbers()
BEGIN
    DECLARE i INT DEFAULT 1;  -- starting value assumed
    WHILE i <= 5 DO
        SELECT i;
        SET i = i + 1;
    END WHILE;
END //
DELIMITER ;
Call:
CALL PrintNumbers();
7. Stored Procedure with CURSOR
DELIMITER //
OPEN cur;
read_loop: LOOP
CLOSE cur;
END //
DELIMITER ;
Call:
CALL ListStudentNames();
32 MySQL Triggers
Triggers are stored programs in MySQL that are automatically invoked in response to certain
events on a table.
What is a Trigger?
A trigger in MySQL is a stored database object that automatically executes (or fires) in
response to a specific event such as:
• INSERT
• UPDATE
• DELETE
It’s like telling MySQL:
“Whenever this action happens on this table, automatically run this block of code.”
Why Use Triggers?
Syntax
CREATE TRIGGER trigger_name
{BEFORE | AFTER} {INSERT | UPDATE | DELETE}
ON table_name
FOR EACH ROW
BEGIN
    -- SQL statements
END;
Notes
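• Use FOR EACH ROW to run the trigger once per affected row.
• Use NEW.column_name to read the incoming row in INSERT and UPDATE triggers.
• Use OLD.column_name to read the existing row in UPDATE and DELETE triggers.
• BEFORE triggers can modify NEW values before they are written.
• AFTER triggers can read NEW and OLD but cannot change them.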
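The NEW and OLD row references can be tried out without a MySQL server. The following is a minimal sketch that assumes Python's built-in sqlite3 module; SQLite's trigger syntax is close to MySQL's but needs no DELIMITER change. The table layouts and the trigger name after_student_insert are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE audit_log (student_id INTEGER);
CREATE TABLE students_archive (id INTEGER, name TEXT);

-- AFTER INSERT: NEW holds the row that was just inserted.
CREATE TRIGGER after_student_insert AFTER INSERT ON students
FOR EACH ROW
BEGIN
    INSERT INTO audit_log (student_id) VALUES (NEW.id);
END;

-- BEFORE DELETE: OLD holds the row that is about to be removed.
CREATE TRIGGER before_student_delete BEFORE DELETE ON students
FOR EACH ROW
BEGIN
    INSERT INTO students_archive (id, name) VALUES (OLD.id, OLD.name);
END;
""")

conn.execute("INSERT INTO students VALUES (1, 'Asha')")
conn.execute("INSERT INTO students VALUES (2, 'Ravi')")
conn.execute("DELETE FROM students WHERE id = 1")

audit = conn.execute("SELECT student_id FROM audit_log ORDER BY student_id").fetchall()
archive = conn.execute("SELECT id, name FROM students_archive").fetchall()
```

Both inserts are logged without any explicit call from the application, and the deleted row is copied to the archive before it disappears.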
Examples
);
DELIMITER //
BEGIN
INSERT INTO audit_log(student_id) VALUES (NEW.id);
END //
DELIMITER ;
DELIMITER //
DELIMITER ;
3. BEFORE DELETE: Archive record
DELIMITER //
CREATE TRIGGER before_student_delete
BEFORE DELETE ON students
FOR EACH ROW
BEGIN
    INSERT INTO students_archive SELECT * FROM students WHERE id = OLD.id;
END //
DELIMITER ;
Trigger Management
List triggers:
SHOW TRIGGERS;
Drop trigger:
DROP TRIGGER trigger_name;
Limitations
• Triggers do not fire on TRUNCATE.
33 Stored Procedures and Triggers in AI-Driven Recruitment
Overview
In an AI-powered recruitment system, stored procedures and triggers help ensure that
complex workflows and automated monitoring actions are executed reliably and efficiently.
Use Case
When a candidate applies for a job, the system needs to:
1. Save the candidate’s basic information
2. Upload their resume reference
3. Insert the AI evaluation score
4. Update the job posting with the increased applicant count
To simplify and encapsulate these steps, a stored procedure called submit_application() is
used.
Additionally, to monitor AI evaluations, an AFTER INSERT trigger on the ai_scores table is
defined to automatically log entries into an audit table.
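The four-step workflow and the audit trigger can be sketched end to end. This is only an approximation of the idea: it assumes Python's sqlite3 module, where a Python function stands in for the stored procedure because SQLite has none. The column layouts, the ai_score_audit table, and the log_ai_score trigger name are assumptions, not the course schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT, applicant_count INTEGER DEFAULT 0);
CREATE TABLE candidates (id INTEGER PRIMARY KEY, name TEXT, resume_ref TEXT, job_id INTEGER);
CREATE TABLE ai_scores (candidate_id INTEGER, score REAL);
CREATE TABLE ai_score_audit (candidate_id INTEGER, score REAL);
CREATE TRIGGER log_ai_score AFTER INSERT ON ai_scores
FOR EACH ROW
BEGIN
    INSERT INTO ai_score_audit VALUES (NEW.candidate_id, NEW.score);
END;
INSERT INTO jobs (id, title) VALUES (456, 'Data Analyst');
""")

def submit_application(conn, name, resume_ref, job_id, score):
    """Run the four steps as one unit; undo everything if any step fails."""
    conn.execute("BEGIN")
    try:
        cur = conn.execute(  # steps 1 and 2: candidate details and resume reference
            "INSERT INTO candidates (name, resume_ref, job_id) VALUES (?, ?, ?)",
            (name, resume_ref, job_id))
        candidate_id = cur.lastrowid
        # step 3: the AFTER INSERT trigger writes the audit row automatically
        conn.execute("INSERT INTO ai_scores VALUES (?, ?)", (candidate_id, score))
        # step 4: bump the applicant counter on the job posting
        conn.execute(
            "UPDATE jobs SET applicant_count = applicant_count + 1 WHERE id = ?",
            (job_id,))
        conn.execute("COMMIT")
        return candidate_id
    except Exception:
        conn.execute("ROLLBACK")
        raise

cid = submit_application(conn, "Asha", "resumes/asha.pdf", 456, 88.5)
```

The caller makes one call; the audit entry appears because of the trigger, not because the function wrote it.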
Diagram: Stored Procedures and Triggers Flow
DELIMITER //
    UPDATE jobs
    SET applicant_count = applicant_count + 1
    WHERE id = p_job_id;
END //
DELIMITER ;
DELIMITER //
DELIMITER ;
Table 1: Benefits
Mechanism Purpose
This architecture:
• Reduces redundancy
• Ensures consistency
• Improves auditability
34 Aggregate Functions in SQL
What are Aggregate Functions?
Aggregate functions perform a calculation on a set of values and return a single summary
value. They are often used with GROUP BY.
Table 1: Common Aggregate Functions
Function          Description
COUNT()           Count rows
SUM()             Add values
AVG()             Average of values
MIN()             Minimum value
MAX()             Maximum value
GROUP_CONCAT()    Concatenate values (MySQL; other systems offer equivalents such as STRING_AGG)
Examples
COUNT()
SUM()
AVG()
GROUP_CONCAT()
Note
• NULLs are ignored by most aggregate functions (except COUNT(*)).
• Use HAVING to filter groups (not WHERE).
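Both notes can be checked on a small data set. The sketch below assumes Python's sqlite3 module and a hypothetical students table with one NULL mark; the names and values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, department TEXT, marks INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?)", [
    ("Asha", "IT", 90), ("Ravi", "IT", 80), ("Meena", "CS", 70), ("Kiran", "CS", None),
])

# COUNT(*) counts every row; COUNT(marks), SUM and AVG skip the NULL mark.
totals = conn.execute(
    "SELECT COUNT(*), COUNT(marks), SUM(marks), AVG(marks) FROM students").fetchone()

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
strong_departments = conn.execute("""
    SELECT department, AVG(marks) AS avg_marks
    FROM students
    GROUP BY department
    HAVING AVG(marks) >= 80
    ORDER BY department
""").fetchall()
```

The average is taken over three marks rather than four rows, and only the IT group survives the HAVING filter.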
35 Aggregate Functions in AI-Driven Recruitment
Aggregate functions in SQL are used in AI-driven recruitment platforms to analyze and
summarize large volumes of candidate data generated by automated evaluation systems.
In particular, the ai_scores table stores scores for each candidate based on different skill
evaluations. These scores are analyzed using standard SQL aggregate functions.
Use Case: Analyzing AI Score Data
Table 1: ai_scores
candidate_id   skill    score
1001           Python   85
1001           SQL      90
1002           Python   78
Table 2: Function Description
Function      Purpose
AVG(score)    Used in HR dashboards to rank candidate averages
MAX(score)    Identifies top skill scorers
MIN(score)    Flags weak skill areas
COUNT(*)      Tracks total number of evaluations
SELECT candidate_id,
       AVG(score) AS avg_score,
       MAX(score) AS max_score,
       MIN(score) AS min_score
FROM ai_scores
GROUP BY candidate_id;
36 Indexes in RDBMS
What is an Index?
An index is a data structure that improves the speed of data retrieval operations on a database
table at the cost of additional storage and maintenance.
Why Use Indexes?
• Speed up SELECT queries
• Optimize WHERE, JOIN, ORDER BY, and GROUP BY
• Enable fast lookups and range searches
How Indexes Work
• Most indexes use B-Trees
• Index stores a sorted copy of key columns
• Maintains a pointer to actual row in the table
Syntax
Example
• Requires tuning
Viewing Indexes
SHOW INDEX FROM students; -- MySQL
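The effect of an index on a lookup can be observed through the query plan. This sketch assumes Python's sqlite3 module rather than MySQL: EXPLAIN QUERY PLAN and PRAGMA index_list are SQLite's rough counterparts of MySQL's EXPLAIN and SHOW INDEX. The table contents and the plan helper are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO students (name, marks) VALUES (?, ?)",
                 [(f"student{i}", i % 100) for i in range(1000)])

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable plan step.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT name FROM students WHERE marks = 42"
plan_before = plan(conn, query)          # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_marks ON students (marks)")
plan_after = plan(conn, query)           # the plan now mentions idx_marks

index_names = [row[1] for row in conn.execute("PRAGMA index_list('students')")]
```

Printing plan_before and plan_after shows the change from a full scan to an index search; the exact wording of the plan text varies between SQLite versions.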
37 B-Trees in RDBMS
What is a B-Tree?
A B-Tree is a self-balancing search tree that maintains sorted data and allows searches,
insertions, and deletions in logarithmic time.
Properties of B-Trees
• Each node has at most m children, where m is the order of the tree
• Each internal node except the root has at least ceil(m/2) children
• All leaves are at the same level
• Keys within a node are kept in sorted order
Why Use in RDBMS?
• Reduces disk I/O
• Supports fast range queries
Search:
• Start at root
• Binary search within node
• Traverse down recursively
Insert:
MySQL Example
CREATE TABLE students (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    marks INT,
    INDEX idx_marks (marks)
);
Benefits
• Balanced structure
• Good for large datasets
• Efficient disk usage
Limitations
• Complex to implement
• Slower than hash indexes for point lookups
B-Trees in RDBMS (Order 3):
What is a B-Tree of Order 3?
A B-Tree of order 3 is a balanced search tree where:
• Each node can have at most 2 keys and 3 child pointers.
Constructing a B-Tree of Order 3
Input:
Insert the following keys in order: 10, 20, 50, 40, 30, 70, 80, 60, 25, 55
• Max 2 keys per node
• Max 3 children per node
• Min 1 key per internal node (except the root)
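Step-by-step Construction:
1. Insert 10
[10]
2. Insert 20
[10, 20]
3. Insert 50
[10, 20] → insert → [10, 20, 50] → overflow → split, promote 20
New structure:
     [20]
    /    \
 [10]    [50]
4. Insert 40
40 > 20 → go to [50] → insert → [40, 50]
     [20]
    /    \
 [10]    [40, 50]
5. Insert 30
30 > 20 → go to [40, 50] → insert → [30, 40, 50] → overflow → split, promote 40
      [20, 40]
     /    |    \
 [10]   [30]   [50]
6. Insert 70
70 > 40 → go to [50] → insert → [50, 70]
      [20, 40]
     /    |    \
 [10]   [30]   [50, 70]
7. Insert 80
[50, 70] → insert → [50, 70, 80] → overflow → split, promote 70
Root becomes [20, 40, 70] → overflow → split, promote 40
            [40]
          /      \
      [20]        [70]
     /    \      /    \
  [10]   [30] [50]   [80]
8. Insert 60
60 > 40 → go to [70] → 60 < 70 → go to [50] → insert → [50, 60]
            [40]
          /      \
      [20]        [70]
     /    \      /    \
  [10]   [30] [50, 60] [80]
9. Insert 25
25 < 40 → go to [20] → 25 > 20 → go to [30] → insert → [25, 30]
            [40]
          /      \
      [20]        [70]
     /    \      /    \
  [10] [25, 30] [50, 60] [80]
10. Insert 55
55 > 40 → go to [70] → 55 < 70 → go to [50, 60] → insert → [50, 55, 60] → overflow → split, promote 55
Final B-Tree:
              [40]
            /      \
        [20]        [55, 70]
       /    \      /    |    \
    [10] [25,30] [50]  [60]  [80]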
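The walk-through above can be replayed in code. The following is a teaching sketch in Python, not how a database engine stores its indexes: it keeps nodes in memory, splits a node when it holds three keys, and promotes the middle key. The Node class and the helper names insert, search, and levels are assumptions introduced here.

```python
import bisect

MAX_KEYS = 2  # order 3: at most 2 keys and 3 children per node

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []    # an empty list means this node is a leaf

def _insert(node, key):
    """Insert key below node; return (median, right_node) if node split, else None."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:                 # leaf: place the key in sorted position
        node.keys.insert(i, key)
    else:                                 # internal: descend, then absorb a child split
        split = _insert(node.children[i], key)
        if split:
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) <= MAX_KEYS:
        return None
    mid = len(node.keys) // 2             # overflow: promote the middle key
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return median, right

def insert(root, key):
    split = _insert(root, key)
    if split:                             # root split: the tree grows by one level
        median, right = split
        return Node([median], [root, right])
    return root

def search(node, key):
    while node:
        i = bisect.bisect_left(node.keys, key)   # binary search within the node
        if i < len(node.keys) and node.keys[i] == key:
            return True
        node = node.children[i] if node.children else None
    return False

def levels(root):
    """Return the keys of every node, grouped level by level."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out

root = Node()
for k in [10, 20, 50, 40, 30, 70, 80, 60, 25, 55]:
    root = insert(root, k)
```

Calling levels(root) after the loop lists the nodes level by level, which can be compared against the final diagram.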
Table 2: Advantages of B-Tree Indexes in RDBMS (Order 3)
Feature                               Benefit
Balanced tree                         Predictable, fast lookup time (log n)
High fan-out (3 children per node)    Fewer levels → less disk I/O
Sorted keys                           Efficient for range queries (<, BETWEEN)
38 B+ Trees in RDBMS
What is a B+ Tree?
A B+ Tree is a balanced tree data structure used to store indexes in databases. It is an
optimized version of the B-Tree where:
• All data is stored in leaf nodes.
• Internal nodes store only keys for navigation.
• Leaf nodes are linked together for fast range queries.
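The benefit of linked leaves shows up in range queries: one descent finds the starting leaf, and the scan then follows the leaf links instead of returning to the root. The sketch below is a hand-built, in-memory illustration in Python; the Leaf and Internal classes and the small three-leaf tree are assumptions, and insertion is deliberately left out.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys      # in a real index each key also carries a row pointer
        self.next = None      # link to the right sibling leaf

class Internal:
    def __init__(self, keys, children):
        self.keys = keys      # separator keys, used only for navigation
        self.children = children

def find_leaf(node, key):
    # A key equal to a separator lives in the right subtree, hence bisect_right.
    while isinstance(node, Internal):
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node

def range_scan(root, low, high):
    """Collect keys in [low, high]: one descent, then a walk along the leaf links."""
    leaf, found = find_leaf(root, low), []
    while leaf:
        for k in leaf.keys:
            if k > high:
                return found
            if k >= low:
                found.append(k)
        leaf = leaf.next
    return found

# Hand-built tree: root [20, 40] over leaves [10] -> [20, 30] -> [40, 50]
leaves = [Leaf([10]), Leaf([20, 30]), Leaf([40, 50])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right
root = Internal([20, 40], leaves)
```

For example, range_scan(root, 20, 45) starts in the middle leaf and crosses one leaf link to pick up 40.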
39 Constructing a B+ Tree of Order 3
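Input Sequence: 10, 20, 50, 40, 30, 60, 80, 70
Insert 10, 20
[10, 20]
Insert 50
Insert into [10, 20] → becomes [10, 20, 50] → overflow → split into [10] and [20, 50], copy 20 up
     [20]
    /    \
 [10]    [20, 50]
Insert 40
Insert into [20, 50] → becomes [20, 40, 50] → overflow → split into [20] and [40, 50], copy 40 up
      [20, 40]
     /    |    \
 [10]   [20]   [40, 50]
Insert 30
Insert into [20] → becomes [20, 30]
      [20, 40]
     /    |    \
 [10] [20, 30] [40, 50]
Leaf Links: [10] → [20, 30] → [40, 50]
Insert 60
Insert into [40, 50] → becomes [40, 50, 60] → overflow
Split into [40] and [50, 60], copy 50 up → root [20, 40, 50] → overflow → split, move 40 up
            [40]
          /      \
      [20]        [50]
     /    \      /    \
  [10] [20,30] [40]  [50,60]
Insert 80
Insert into [50, 60] → becomes [50, 60, 80] → overflow → split into [50] and [60, 80], copy 60 up
            [40]
          /      \
      [20]        [50, 60]
     /    \      /    |    \
  [10] [20,30] [40]  [50]  [60,80]
Insert 70
Insert into [60, 80] → becomes [60, 70, 80] → overflow → split into [60] and [70, 80], copy 70 up
Internal node becomes [50, 60, 70] → overflow → split, move 60 up
               [40, 60]
             /     |     \
        [20]     [50]     [70]
       /    \    /   \    /   \
   [10] [20,30] [40] [50] [60] [70,80]
Final Leaf Links:
[10] → [20,30] → [40] → [50] → [60] → [70,80]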
Use in MySQL
CREATE TABLE students (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    marks INT,
    INDEX idx_marks (marks)
);
Benefits
40 Views in RDBMS
What is a View?
A view is a virtual table that is defined by an SQL query. It does not store data itself but
provides a way to represent data from one or more base tables.
• Views behave like tables in SELECT queries.
• The data in a view is fetched dynamically from the underlying tables.
View Syntax
CREATE VIEW
CREATE VIEW view_name AS
SELECT column1, column2
FROM table_name
WHERE condition;
DROP VIEW
DROP VIEW view_name;
Example
CREATE TABLE students (
id INT,
SELECT * FROM high_scorers;
Table 2: Types of Views
Type                 Description
Simple View          Based on a single table, no aggregates
Complex View         Uses joins, aggregates, subqueries
Updatable View       Allows INSERT/UPDATE/DELETE (if criteria met)
Read-only View       Cannot be updated (e.g., uses GROUP BY)
Materialized View    Physically stores data (not supported in MySQL)
-- Non-updatable example
CREATE VIEW dept_avg AS
SELECT department, AVG(marks)
FROM students
GROUP BY department;
Advantages of Views
• Encapsulate query complexity
• Support data security and access control
• Provide abstraction layer between schema and users
• Enable code reuse
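That a view stores no data of its own can be verified directly. The sketch below assumes Python's sqlite3 module; the students rows and the 80-mark cutoff used to define high_scorers are assumptions, since any filter would demonstrate the same behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, department TEXT, marks INTEGER);
INSERT INTO students VALUES (1, 'Asha', 'IT', 92), (2, 'Ravi', 'IT', 64), (3, 'Meena', 'CS', 81);
-- The cutoff of 80 is an assumed threshold for illustration.
CREATE VIEW high_scorers AS
    SELECT id, name, marks FROM students WHERE marks >= 80;
""")

before = conn.execute("SELECT name FROM high_scorers ORDER BY id").fetchall()
# The view holds no rows of its own, so a base-table update shows up immediately.
conn.execute("UPDATE students SET marks = 85 WHERE id = 2")
after = conn.execute("SELECT name FROM high_scorers ORDER BY id").fetchall()
```

The second query returns an extra row although the view itself was never touched.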
Limitations of Views
41 Views in AI-Driven Recruitment
What is a View?
A view in an AI-driven recruitment system is a virtual table that presents selected and
simplified data drawn from one or more underlying base tables. It allows stakeholders like HR
managers, administrators, or AI dashboards to access relevant information without directly
interacting with complex joins or sensitive data.
Use Case
In a recruitment platform, data may be spread across multiple tables:
• candidates: stores personal and job application details
• ai_scores: stores evaluation results generated by the AI engine
• jobs: contains job listings and metadata
To streamline access, a view named candidate_summary is created.
This view allows users to:
• See candidates’ names, emails, and applied job titles
• View average AI scores for shortlisting
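One possible shape for candidate_summary is sketched below. It assumes Python's sqlite3 module, and every column name (job_id, email, avg_ai_score, and so on) as well as the sample rows are guesses made for illustration; the actual course schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE candidates (id INTEGER PRIMARY KEY, name TEXT, email TEXT, job_id INTEGER);
CREATE TABLE ai_scores (candidate_id INTEGER, skill TEXT, score INTEGER);
INSERT INTO jobs VALUES (456, 'Data Analyst');
INSERT INTO candidates VALUES (1001, 'Asha', 'asha@example.com', 456),
                              (1002, 'Ravi', 'ravi@example.com', 456);
INSERT INTO ai_scores VALUES (1001, 'Python', 85), (1001, 'SQL', 90), (1002, 'Python', 78);

-- The joins and the aggregation are hidden behind a single view name.
CREATE VIEW candidate_summary AS
    SELECT c.name, c.email, j.title AS job_title, AVG(s.score) AS avg_ai_score
    FROM candidates c
    JOIN jobs j ON j.id = c.job_id
    JOIN ai_scores s ON s.candidate_id = c.id
    GROUP BY c.id, c.name, c.email, j.title;
""")

shortlist = conn.execute(
    "SELECT name, job_title, avg_ai_score FROM candidate_summary "
    "WHERE avg_ai_score >= 80 ORDER BY name").fetchall()
```

A dashboard can then filter on avg_ai_score as if it were an ordinary column, without knowing how it is computed.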
Diagram: View Integration
Diagram Description:
Simplification         Hides complexity of joins and aggregations
Security               Restricts direct access to sensitive base tables
Reusability            Reused across dashboards and reporting modules
Logical Abstraction    Interface remains stable even if schema changes
Security Use
You can expose only the view to certain users:
ar
GRANT SELECT ON candidate_summary TO hr_user;
This way, HR sees evaluated summaries without needing access to raw AI scores or candidate
profiles.
42 Security and Backup in RDBMS
Security in RDBMS
Definition:
Security in RDBMS involves protecting the integrity, confidentiality, and availability of
database data through authentication, authorization, and access control.
Key Mechanisms:
Example:
Best Practices:
• Apply the principle of least privilege
Types of Backup:
Type            Description
Full Backup     Copy of entire database
Incremental     Only changes since the last backup of any type
Differential    Changes since last full backup
Logical         Export of SQL commands (e.g., mysqldump, pg_dump)
Physical        File-level copy of database files
Examples:
# MySQL Full Backup
mysqldump -u root -p mydb > backup.sql
# PostgreSQL Backup
pg_dump mydb > backup.pgsql
Restore:
# MySQL Restore
mysql -u root -p mydb < backup.sql
# PostgreSQL Restore
psql mydb < backup.pgsql
Table 2: Summary
Category Summary
Security Protect data using roles, views, encryption, and auditing
Backup Ensure data safety via regular full/incremental backups
43 Transactions and Related Topics in RDBMS
What is a Transaction?
A transaction is a group of SQL operations that execute as a single unit. It either fully
completes (COMMIT) or fully aborts (ROLLBACK).
Table 1: ACID Properties
Property       Description
Atomicity      All operations succeed or none at all
Consistency    Database moves from one valid state to another
Isolation      Concurrent transactions do not interfere with each other
Durability     Committed changes persist even after system failure
START TRANSACTION;
-- SQL operations
COMMIT;
-- or to undo
ROLLBACK;
Example:
START TRANSACTION;
UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;
COMMIT;
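Atomicity is easiest to see when the second statement fails. The sketch below assumes Python's sqlite3 module with explicit BEGIN/COMMIT; the CHECK constraint that forbids negative balances, the opening balances, and the transfer helper are assumptions added to force a failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # we issue BEGIN/COMMIT ourselves
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES (1, 800), (2, 100)")

def transfer(conn, src, dst, amount):
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")   # the credit applied above is undone as well
        return False

def balances(conn):
    return conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()

first = transfer(conn, 1, 2, 500)    # succeeds
second = transfer(conn, 1, 2, 500)   # would overdraw account 1, so nothing changes
```

The second transfer credits account 2 before the debit fails, yet after the rollback neither balance has moved.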
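Isolation Levels
Each level permits (Yes) or prevents (No) the following read anomalies:
Isolation Level     Dirty Read   Non-Repeatable Read   Phantom Read
READ UNCOMMITTED    Yes          Yes                   Yes
READ COMMITTED      No           Yes                   Yes
REPEATABLE READ     No           No                    Yes
SERIALIZABLE        No           No                    No
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Issue                  Description
Dirty Read             Reading uncommitted data from another transaction
Non-Repeatable Read    Re-reading the same row gives different results
Phantom Read           New rows appear in repeated queries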
Savepoints
Savepoints are markers within a transaction to allow partial rollback.
START TRANSACTION;
SAVEPOINT sp1;
UPDATE employees SET salary = salary + 1000 WHERE id = 1;
ROLLBACK TO sp1;
COMMIT;
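A partial rollback keeps the work done before the savepoint. The sketch below assumes Python's sqlite3 module with explicit transaction control; the employees rows and the extra update placed before the savepoint are assumptions added to make the difference visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transaction control
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES (1, 50000), (2, 40000)")

conn.execute("BEGIN")
conn.execute("UPDATE employees SET salary = salary + 2000 WHERE id = 2")  # kept
conn.execute("SAVEPOINT sp1")
conn.execute("UPDATE employees SET salary = salary + 1000 WHERE id = 1")  # undone below
conn.execute("ROLLBACK TO sp1")
conn.execute("COMMIT")

salaries = conn.execute("SELECT id, salary FROM employees ORDER BY id").fetchall()
```

Only the update made after sp1 is discarded; the earlier raise is committed.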
Autocommit
By default, some RDBMS (e.g., MySQL) commit each statement automatically. To group
statements into explicit transactions, disable autocommit:
SET autocommit = 0;
Table 4: Transactions
Concept Use
Transaction Logical unit of database operations
COMMIT Apply all changes permanently
ROLLBACK Undo changes in case of error
44 Additional SQL Transaction-Based Questions and Answers
Question 1:
Write an SQL transaction to handle candidate withdrawal from a job application. Ensure that
the corresponding entry in the Application table is deleted, and if the deletion fails, roll back
the changes.
Answer:
START TRANSACTION;
SAVEPOINT after_delete;
-- If no error:
COMMIT;
-- If error occurs:
-- ROLLBACK TO after_delete;
-- COMMIT;
Answer:
START TRANSACTION;
SAVEPOINT before_update;
UPDATE ai_scores
SET score = 92
WHERE candidate_id = 123 AND skill = 'Python';
-- If not:
-- ROLLBACK TO before_update;
-- COMMIT;
Question 3:
Write an SQL transaction to assign a new job posting by inserting into the Jobs table and then
automatically logging the insertion into an audit_log table. Rollback if either step fails.
Answer:
START TRANSACTION;
INSERT INTO Jobs (job_id, title, department)
VALUES (301, 'Data Analyst', 'AI');
SAVEPOINT after_job_insert;
INSERT INTO audit_log (operation, entity, timestamp)
VALUES ('INSERT', 'Jobs', NOW());
COMMIT;
Question 4:
Write an SQL transaction to simulate a candidate moving from the ‘shortlisted’ to the
‘selected’ stage. Update the status and create a backup entry in a status_history table.
Answer:
START TRANSACTION;
UPDATE Application
SET status = 'selected'
WHERE candidate_id = 123 AND job_id = 456;
SAVEPOINT after_status_update;
COMMIT;
START TRANSACTION;
UPDATE Jobs
SET status = 'closed'
WHERE job_id = 456;
DELETE FROM Application
WHERE job_id = 456;
SAVEPOINT after_cleanup;
INSERT INTO audit_log (operation, entity, details, timestamp)
VALUES ('DELETE', 'Application', 'All candidates for job 456', NOW());
COMMIT;
-- If logging fails:
-- ROLLBACK TO after_cleanup;
-- COMMIT;
Purpose: Closes the job and removes its applications, then logs the cleanup. If logging
fails, rolling back to the savepoint undoes only the logging step, so the cleanup is still
committed.
Question 6:
Write an SQL transaction for an AI-based recruitment system specified in Question No. 3
that ensures the data consistency when a candidate applies for a job. The transaction should
consist of the following steps:
Explanation:
Step                     Description
START TRANSACTION        Begins an atomic transaction block
INSERT INTO              Attempts to insert a candidate’s application into the database
SAVEPOINT                Creates a rollback checkpoint after successful insertion
COMMIT                   Finalizes all changes if successful
ROLLBACK TO SAVEPOINT    Undoes partial changes if something goes wrong
This transaction ensures that the system maintains data integrity and avoids storing partial
or failed application data.
45 Transaction Management in AI-Driven Recruitment
Context
In an AI-powered recruitment system, a candidate’s application involves multiple steps that
must be atomically committed to ensure data integrity.
Transactional Steps
START TRANSACTION;
INSERT INTO candidates (...) VALUES (...);
INSERT INTO resumes (...) VALUES (...);
INSERT INTO assessments (...) VALUES (...);
UPDATE jobs SET applicant_count = applicant_count + 1 WHERE ...;
COMMIT;
ACID Application
Concurrency Example
Savepoints
SAVEPOINT after_resume;
-- AI score fails
ROLLBACK TO after_resume;
46 Serialization in RDBMS
What is Serialization?
Serialization ensures that the concurrent execution of transactions produces the same
result as if the transactions were executed serially (one after another).
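One standard way to decide whether a schedule is equivalent to some serial order is the conflict-serializability test from database textbooks: build a precedence graph and check it for cycles. The notes do not present this test, so the Python sketch below is an added illustration, and the tuple format used for schedules is an assumption.

```python
from itertools import combinations

def conflict_serializable(schedule):
    """schedule: list of (transaction, 'R' or 'W', item) in execution order."""
    edges = set()
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        # Two operations conflict if they come from different transactions,
        # touch the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))          # t1 must precede t2 in any serial order
    nodes = {t for t, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge; a leftover means a cycle.
    while nodes:
        free = {n for n in nodes
                if not any(b == n and a in nodes for a, b in edges)}
        if not free:
            return False
        nodes -= free
    return True

lost_update = [("T1", "R", "count"), ("T2", "R", "count"),
               ("T1", "W", "count"), ("T2", "W", "count")]
serial = [("T1", "R", "count"), ("T1", "W", "count"),
          ("T2", "R", "count"), ("T2", "W", "count")]
```

The interleaved schedule produces edges in both directions between T1 and T2, so no serial order can explain it; the serial schedule produces edges in one direction only.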
Why is it Important?
• Prevents concurrency issues such as dirty reads and phantom reads
• Ensures ACID compliance, especially Isolation
• Maintains consistency even in multi-user environments
Serializable Isolation Level
Technique                                Description
Two-Phase Locking (2PL)                  A transaction acquires all its locks before releasing any
Timestamp Ordering                       Transactions ordered by timestamps
Serializable Snapshot Isolation (SSI)    Prevents anomalies without explicit locks
Predicate Locking                        Locks ranges or sets based on conditions
Table 2: Trade-Offs
Pros Cons
Strong consistency Lower concurrency
Prevents all anomalies Can cause deadlocks
Concept Serialization
Goal Make concurrent transactions behave like serial execution
SQL Support SERIALIZABLE isolation level
Techniques Used 2PL, SSI, Timestamps, Predicate Locks
47 Serialization in AI-Driven Recruitment
What is Serialization?
In a multi-user AI-driven recruitment platform, multiple candidates might apply to the same
job simultaneously. Serialization ensures that these concurrent transactions behave as if
they were executed one after another, maintaining data consistency.
Use Case
Assume two candidates (A and B) are applying to the same job posting at the same time. Each
of their application workflows includes:
• Inserting candidate record
• Uploading resume
• Submitting AI-evaluated assessment
• Updating job applicant count
To avoid lost updates or inconsistent scores, the database uses serialization techniques
to ensure that the outcome is the same as if:
1. Transaction T1 (Candidate A) completes entirely.
2. Only then does Transaction T2 (Candidate B) start.
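The lost update on the applicant count can be reproduced with a tiny simulation. This Python sketch does not use a database at all: it replays the read and write steps of "applicant_count = applicant_count + 1" in a given order, and the run helper and the two schedules are hypothetical.

```python
def run(schedule, applicant_count=0):
    """Replay the read/write steps of 'count = count + 1' for each transaction."""
    local = {}                        # each transaction's private copy of the counter
    for txn, step in schedule:
        if step == "read":
            local[txn] = applicant_count
        else:                         # "write": store the incremented private copy
            applicant_count = local[txn] + 1
    return applicant_count

interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]
serial_order = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]

lost = run(interleaved)       # both transactions read 0, so one increment is lost
correct = run(serial_order)   # T2 reads the value T1 wrote
```

With the interleaved order the job ends up with one applicant although two applied, which is the outcome serialization is meant to rule out.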
Explanation:
• T1 and T2 represent two application processes.
• A serialization control layer ensures only one transaction modifies the job data at a
time.
START TRANSACTION;
-- Candidate A operations (Insert, Upload, Score, Update)
COMMIT;
-- Candidate B's transaction begins only after A completes
Benefits
• Prevents race conditions in updating shared tables
• Ensures AI evaluations and candidate metadata remain correct
• Guarantees fairness in job application ordering