M5 - Relational Model and Relational Database
🧾 Key Features:
• Data is stored in relations (tables).
• Each row represents a tuple (record).
• Each column represents an attribute (field).
• Each relation has a unique name.
📊 Example:
Student_ID Name Dept
101 Ravi CSE
102 Priya IT
103 Amit ECE
Here, Student is the relation (table) with 3 attributes: Student_ID, Name, Dept.
📘 Design Issues
🔰 Introduction
When designing a relational database, we need to ensure that the database is efficient, free from
unnecessary duplication, and easy to update. Poor design can lead to data redundancy,
inconsistencies, and anomalies.
The goal is to create a design that:
• Minimizes redundancy
• Avoids anomalies
• Maintains data integrity
• Is easy to query and update
✅ Major Design Issues
1. Redundancy
📌 What is it?
Redundancy means storing the same data in multiple places.
⚠️ Problem:
• Wastes storage space
• Increases chances of inconsistency if data is not updated everywhere
🧾 Example:
A student’s department name is stored in every student record. If the department name changes, all
records must be updated.
2. Update Anomalies
📌 What is it?
Problems that occur when we try to update data in a poorly designed database.
⚠️ Types:
• Update Anomaly: Changing one piece of data requires updating multiple rows.
• Insert Anomaly: Cannot add data because other required data is missing.
• Delete Anomaly: Deleting data may unintentionally remove important facts.
🧾 Example:
• Deleting a student may also remove the only record of a department.
3. Null Values
📌 What is it?
A null value means missing or unknown data.
⚠️ Problem:
• Difficult to understand and handle in queries
• May lead to errors in calculations
🧾 Example:
If a teacher has not yet been assigned to a course, the “Teacher_ID” field might be null.
4. Spurious Tuples (Fake Data)
📌 What is it?
When we join poorly designed tables, it may create fake rows that don’t exist in the real world.
⚠️ Problem:
• Produces incorrect information
• Leads to confusion and wrong reports
🧾 Example:
If student and course tables are joined on the wrong key, we might get combinations that never
existed.
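The effect of joining on the wrong key can be sketched in a few lines of Python. The rows, the column names, and the `join_pairs` helper below are hypothetical and only illustrate the idea: joining on Student_ID returns the real (student, course) pairs, while joining on Dept pairs every student with every course in the same department.

```python
# Hypothetical sample data; names and values are illustrative only.
students = [
    {"Student_ID": 101, "Name": "Ravi", "Dept": "CSE"},
    {"Student_ID": 102, "Name": "Priya", "Dept": "CSE"},
]
# Each row records one course that a student really takes.
enrollments = [
    {"Student_ID": 101, "Dept": "CSE", "Course": "DBMS"},
    {"Student_ID": 102, "Dept": "CSE", "Course": "OS"},
]

def join_pairs(on):
    """Join the two tables on column `on` and return sorted (Name, Course) pairs."""
    return sorted(
        (s["Name"], e["Course"])
        for s in students
        for e in enrollments
        if s[on] == e[on]
    )

good = join_pairs("Student_ID")  # joined on the correct key
bad = join_pairs("Dept")         # joined on the wrong key

# Rows that appear only in the bad join never existed in the real world.
spurious = set(bad) - set(good)
print(spurious)
```

Here `spurious` holds the two invented pairs (Ravi with OS, Priya with DBMS), which is exactly what a report built on the bad join would wrongly show.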
5. Loss of Information
📌 What is it?
When decomposing tables to remove redundancy, we might lose some original information.
⚠️ Problem:
• Important data may become unavailable
• Cannot reconstruct original table properly
🔍 Solution:
Use lossless decomposition where the original table can be recovered exactly.
6. Dependency Preservation
📌 What is it?
All functional dependencies should still be valid after decomposition.
⚠️ Problem:
• If dependencies are not preserved, data integrity rules cannot be enforced.
🔍 Solution:
Check and ensure all original dependencies still apply in the new design.
7. Data Integrity
📌 What is it?
Ensuring the accuracy, consistency, and reliability of data.
📏 Integrity Constraints:
• Entity Integrity: Primary key should not be null.
• Referential Integrity: Foreign key must match a primary key or be null.
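As a rough illustration, both constraints can be tried out with Python's built-in `sqlite3` module. The Student and Course tables and their values are assumptions made for this sketch, and SQLite is only one possible engine (it enforces foreign keys only after `PRAGMA foreign_keys = ON`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables used only for this sketch.
conn.execute("CREATE TABLE Student (RollNo TEXT NOT NULL PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE Course ("
    " CourseID TEXT NOT NULL PRIMARY KEY,"
    " RollNo TEXT REFERENCES Student(RollNo))"
)
conn.execute("INSERT INTO Student VALUES ('101', 'Ravi')")

def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = {
    # Entity integrity: the primary key must be present and unique.
    "null primary key": try_insert("INSERT INTO Student VALUES (NULL, 'Nobody')"),
    "duplicate primary key": try_insert("INSERT INTO Student VALUES ('101', 'Priya')"),
    # Referential integrity: a foreign key must match a primary key or be null.
    "foreign key without match": try_insert("INSERT INTO Course VALUES ('C1', '999')"),
    "null foreign key": try_insert("INSERT INTO Course VALUES ('C2', NULL)"),
    "matching foreign key": try_insert("INSERT INTO Course VALUES ('C3', '101')"),
}
print(results)
```

The first three inserts should be rejected and the last two accepted, matching the two rules listed above.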
8. Efficient Query Processing
📌 What is it?
Design must allow faster retrieval and manipulation of data.
⚠️ Problem:
Too many tables or poor indexing can slow down queries.
🔍 Solution:
Balance between normalization and performance.
🔑 Keys in Relational Model (DBMS)
✅ What is a Key?
A key is an attribute (column) or a set of attributes that is used to uniquely identify a row (tuple) in
a relation (table).
Keys are very important for:
• Maintaining data integrity
• Avoiding duplicate records
• Defining relationships between tables
🔎 Types of Keys
1. Super Key
📘 Definition:
A super key is any set of attributes that uniquely identifies a tuple (row) in a relation.
🧾 Example:
In a Student table:
Student(RollNo, Name, Email, Phone)
Super keys include:
• {Phone, RollNo}
✅ Note: A super key may contain extra attributes.
2. Candidate Key
📘 Definition:
A candidate key is a minimal super key: it has no extra attribute and still uniquely identifies
each tuple.
⚠️ Rule:
• Must be unique
• Must be minimal
🧾 Example:
From above table:
• {RollNo}, {Email}: both can be candidate keys (assuming they are unique)
⏩ Out of all super keys, the minimal ones are called candidate keys.
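The difference between super keys and candidate keys can be explored on sample data. The rows below are made up, and uniqueness in one sample only suggests a key; real keys come from the rules of the domain. The sketch lists every attribute set that is unique in the sample (super keys) and then keeps only the minimal ones (candidate keys).

```python
from itertools import combinations

# Hypothetical rows for Student(RollNo, Name, Email, Phone).
rows = [
    {"RollNo": 1, "Name": "Ravi", "Email": "ravi@example.com", "Phone": "111"},
    {"RollNo": 2, "Name": "Priya", "Email": "priya@example.com", "Phone": "222"},
    {"RollNo": 3, "Name": "Priya", "Email": "priya2@example.com", "Phone": "222"},
]
attributes = ["RollNo", "Name", "Email", "Phone"]

def is_unique(attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

# Every non-empty attribute set that is unique in the sample acts as a super key.
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_unique(combo)
]

# A candidate key is a super key with no proper subset that is also a super key.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]
print(candidate_keys)
```

In this sample {Phone, RollNo} is a super key but not a candidate key, because {RollNo} alone is already unique.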
3. Primary Key
📘 Definition:
A primary key is one of the candidate keys chosen by the database designer to uniquely identify
records.
⚠️ Rules:
• Cannot be null
• Must be unique
• Only one primary key per table
🧾 Example:
In Student table, if RollNo is selected, then:
PRIMARY KEY = {RollNo}
4. Alternate Key
📘 Definition:
All candidate keys not chosen as the primary key are called alternate keys.
🧾 Example:
If RollNo is the primary key, then Email (another candidate key) is the alternate key.
5. Composite Key
📘 Definition:
A composite key is a key that consists of two or more attributes that together uniquely identify a
tuple.
🧾 Example:
In a table storing course enrollments:
Enrollment(StudentID, CourseID, Date)
Here, neither StudentID nor CourseID alone is unique, but the combination {StudentID,
CourseID} is unique.
So:
COMPOSITE KEY = {StudentID, CourseID}
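A composite key can be declared directly in SQL. The sketch below uses Python's built-in `sqlite3` module with the Enrollment table from the example; the inserted values are made up. Repeating a StudentID or a CourseID on its own is accepted, but repeating the same pair is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Enrollment ("
    " StudentID INTEGER NOT NULL,"
    " CourseID TEXT NOT NULL,"
    " Date TEXT,"
    " PRIMARY KEY (StudentID, CourseID))"  # the composite key
)

# Hypothetical rows: each single column repeats, but every pair is distinct.
conn.execute("INSERT INTO Enrollment VALUES (101, 'DBMS', '2024-01-10')")
conn.execute("INSERT INTO Enrollment VALUES (101, 'OS', '2024-01-10')")
conn.execute("INSERT INTO Enrollment VALUES (102, 'DBMS', '2024-01-11')")

try:
    # Same (StudentID, CourseID) pair as the first row.
    conn.execute("INSERT INTO Enrollment VALUES (101, 'DBMS', '2024-02-01')")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Enrollment").fetchone()[0]
print(duplicate_pair_rejected, row_count)
```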
6. Foreign Key
📘 Definition:
A foreign key is an attribute in one table that refers to the primary key of another table.
🧾 Example:
Two tables:
Student(RollNo, Name)
Course(CourseID, RollNo)
✅ Here, RollNo in Course is a foreign key referring to Student.RollNo.
Key Type | Definition | Uniqueness | Null Allowed | Example
Super Key | Any set of attributes that uniquely identifies a row | Yes | Yes (if not PK) | {RollNo}, {RollNo, Name}
Candidate Key | Minimal super key with no extra attribute | Yes | No | {RollNo}, {Email}
Primary Key | Selected candidate key | Yes | No | {RollNo}
Alternate Key | Other candidate keys not selected as primary | Yes | No | {Email}
Composite Key | Key formed by multiple attributes | Yes | No | {StudentID, CourseID}
Foreign Key | Refers to primary key in another table | Not always | Yes | RollNo in Course refers to Student
🔐 Closure of Attribute Set
📘 What is Closure?
The closure of an attribute set is the set of all attributes that can be functionally determined
from a given set of attributes using a set of functional dependencies (FDs).
It helps us:
• Find candidate keys
• Check normalization
• Test if a decomposition is lossless
• Preserve dependencies
🧠 Formal Definition
Let:
• F be a set of functional dependencies
• X be a set of attributes
Then the closure of X, denoted as X⁺, is the set of all attributes A such that X → A can be derived
from F.
4 | {A, B, C, D} | No more FDs can be applied | Done
✅ Final Closure: A⁺ = {A, B, C, D}
🔍 Another Example
Relation:
R(P, Q, R, S)
FDs:
1. P → Q
2. Q → R
3. R → S
Find: P⁺
Step | X⁺ | FD Used | Result
1 | {P} | P → Q | Add Q
2 | {P, Q} | Q → R | Add R
3 | {P, Q, R} | R → S | Add S
✅ P⁺ = {P, Q, R, S}
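The step-by-step procedure used in the table can be written as a short loop: keep applying any FD whose left side is already inside the closure until nothing new is added. This is a minimal sketch; the function name `closure` and the representation of FDs as pairs of Python sets are choices made here, not part of the notes.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (left, right) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # An FD applies once its whole left side is already in the closure.
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

# FDs of R(P, Q, R, S) from the example: P -> Q, Q -> R, R -> S.
fds = [({"P"}, {"Q"}), ({"Q"}, {"R"}), ({"R"}, {"S"})]
p_closure = closure({"P"}, fds)
print(sorted(p_closure))  # ['P', 'Q', 'R', 'S']
```

Because P⁺ contains every attribute of R, P is a super key of R; starting from Q instead gives only {Q, R, S}.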
Functional Dependency (FD)
A functional dependency is a relationship between attributes of a relation (table).
🔹 Easy Meaning:
If we know the value of one attribute (or a group of attributes), and we can uniquely identify the
value of another attribute, then we say:
👉 "One attribute functionally determines the other."
🧾 Example:
Imagine a table Student with columns:
RollNo, Name, Department, Phone
RollNo → Department
RollNo → Phone
📙 Formal Definition
A Functional Dependency X → Y between two sets of attributes X and Y in a relation means:
If two rows (tuples) have the same value of X, then they must have the same value of Y.
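The formal definition translates almost directly into a check over the rows of a table. The helper `fd_holds` and the sample rows are hypothetical. Passing the check on one sample does not prove the FD in general, but a single failing pair of rows is enough to disprove it.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether lhs -> rhs holds in the given rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False  # two rows agree on X but differ on Y
    return True

# Hypothetical rows for Student(RollNo, Name, Department, Phone).
students = [
    {"RollNo": 1, "Name": "Ravi", "Department": "CSE", "Phone": "111"},
    {"RollNo": 2, "Name": "Priya", "Department": "IT", "Phone": "222"},
    {"RollNo": 3, "Name": "Amit", "Department": "CSE", "Phone": "333"},
]

print(fd_holds(students, ["RollNo"], ["Department"]))  # True
print(fd_holds(students, ["Department"], ["RollNo"]))  # False
```

Department → RollNo fails because two rows share the department CSE but have different roll numbers.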
🔐 Real-Life Example
In a library database:
Book Table
ISBN, Title, Author, Publisher
Here:
• Every ISBN is unique for each book.
• If you know the ISBN, you can find the exact Title, Author, and Publisher.
So we write:
📌 ISBN → Title, Author, Publisher
Types of Functional Dependencies
Trivial Functional Dependency
The right side is already included in the left side.
🧾 Example:
• A → A (trivial)
• (A, B) → A (trivial)
✔ These are always true and not very useful.
Non-Trivial Functional Dependency
The right side is not included in the left side.
🧾 Example:
• RollNo → Name (non-trivial)
✔ These are useful for designing a database.
Full Functional Dependency
Y depends on the whole of X, not just a part.
🧾 Example:
In table Enrollment(StudentID, CourseID, Grade)
• ✅ (StudentID, CourseID) → Grade (Full FD)
• ❌ StudentID → Grade (Partial FD, not full)
Partial Functional Dependency
Y depends on part of a composite key (not the whole).
🧾 Example:
If:
• Primary Key = (StudentID, CourseID)
• FD: StudentID → Grade (only part of the key)
Then it is a partial dependency, which is not allowed in 2NF.
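Partial dependencies can be detected mechanically: take each proper subset of the composite key and see which non-key attributes it determines on its own. The sketch below assumes the relation has a single candidate key, and the FD StudentID → StudentName is a hypothetical addition used to give the check something to find.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (left, right) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

def partial_dependencies(key, fds):
    """Map each proper subset of `key` to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found

key = {"StudentID", "CourseID"}
fds = [
    ({"StudentID", "CourseID"}, {"Grade"}),   # full dependency on the whole key
    ({"StudentID"}, {"StudentName"}),         # hypothetical partial dependency
]
violations = partial_dependencies(key, fds)
print(violations)  # {('StudentID',): {'StudentName'}}
```

An empty result means no part of the key determines a non-key attribute by itself, which is the condition 2NF asks for.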
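Transitive Functional Dependency
If X → Y and Y → Z, then X → Z.
🧾 Example:
• A → B and B → C
• So A → C (transitive)
A non-key attribute that depends on the key only transitively, through another non-key attribute,
is not allowed in 3NF.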
Rule Description
Reflexivity If Y is part of X, then X → Y
Augmentation If X → Y, then XZ → YZ (adding same things to both sides)
Transitivity If X → Y and Y → Z, then X → Z
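These three rules are known as Armstrong's axioms, and other useful rules follow from them. As a short worked example, the union rule (if X → Y and X → Z, then X → YZ) can be derived using only augmentation and transitivity:

```latex
\begin{align*}
1.\;& X \to Y   && \text{given} \\
2.\;& X \to Z   && \text{given} \\
3.\;& X \to XY  && \text{augmentation of (1) with } X \\
4.\;& XY \to YZ && \text{augmentation of (2) with } Y \\
5.\;& X \to YZ  && \text{transitivity of (3) and (4)}
\end{align*}
```

In step 3 the left side is written X rather than XX, because attribute sets ignore duplicates.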
📊 Functional Dependency Diagram (Arrow Notation)
Here's how we show FDs using arrows:
RollNo ──────→ Name
(FD)
Another example:
(StudentID, CourseID) ──────→ Grade
🧩 Types of Anomalies
There are mainly three types of anomalies:
1. ✏️Insertion Anomaly
This happens when we cannot insert a new record into the database without including unrelated
data.
🧾 Example:
Student_Course Table
102 Priya OS Prof. Singh
👉 Now, suppose we want to add a new course that has no students enrolled yet.
We cannot insert it unless we also insert a fake or dummy student.
❌ Problem:
We are forced to insert unnecessary data, or we can't insert valid data.
2. 📝 Update Anomaly
This happens when the same data is repeated in many rows, and we have to update it in all places.
If we forget one, data becomes inconsistent.
🧾 Example:
105 Neha DBMS Prof. Das
👉 If Prof. Das gets replaced by Prof. Sharma for DBMS, we must update every row that lists DBMS.
❌ Problem:
If we forget one row, data becomes wrong or inconsistent.
3. 🗑️ Deletion Anomaly
This happens when deleting a row also removes valuable data that we still need.
🧾 Example:
103 Amit AI Prof. Roy
👉 If Amit drops out and we delete his row, we also lose information about the AI course and
Prof. Roy.
❌ Problem:
Deleting one thing causes loss of other useful data.
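The deletion anomaly can be reproduced with plain Python lists. The three rows are the ones quoted in the examples above; the column names (StudentID, Name, Course, Instructor) are assumed for this sketch. The second half shows how keeping courses in their own table avoids the loss.

```python
# Column names are assumed; the rows are the ones quoted in the examples.
student_course = [
    {"StudentID": 102, "Name": "Priya", "Course": "OS", "Instructor": "Prof. Singh"},
    {"StudentID": 105, "Name": "Neha", "Course": "DBMS", "Instructor": "Prof. Das"},
    {"StudentID": 103, "Name": "Amit", "Course": "AI", "Instructor": "Prof. Roy"},
]

def known_courses(rows):
    """Courses that can still be found in the combined table."""
    return {row["Course"] for row in rows}

# Deletion anomaly: removing Amit's only row also removes every trace of AI.
after_delete = [row for row in student_course if row["StudentID"] != 103]
ai_lost = "AI" not in known_courses(after_delete)

# Split design: course facts live in their own table, separate from enrollments.
courses = {row["Course"]: row["Instructor"] for row in student_course}
enrollments = [(row["StudentID"], row["Course"]) for row in student_course]
enrollments = [e for e in enrollments if e[0] != 103]  # Amit drops out
ai_kept = "AI" in courses

print(ai_lost, ai_kept)  # True True
```

With the split design, deleting Amit's enrollment leaves the AI course and its instructor untouched.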
🛠️ Solution: Normalization
To solve anomalies, we apply Normalization.
• Break big, redundant tables into smaller related tables
• Ensure each table has a clear purpose and dependencies
• Apply normal forms: 1NF → 2NF → 3NF → BCNF
✅ What is Normalization?
Normalization is a process in DBMS to organize data in a database by:
• Reducing redundancy (duplicate data)
• Eliminating anomalies (insertion, update, deletion problems)
• Ensuring data is stored logically and efficiently
It breaks a big table into smaller related tables while maintaining relationships.
🔄 Why Normalize?
Poor database design leads to:
• Data repetition
• Wasted space
• Difficult updates
• Wrong/incomplete results
Normalization solves these problems.
First Normal Form (1NF)
• No repeating groups or arrays
❌ Not in 1NF:
Second Normal Form (2NF)
• All non-key attributes must be fully functionally dependent on the entire primary key
👉 Removes partial dependency
❌ Not in 2NF (composite key problem):
• StudentName depends only on StudentID → partial dependency
✅ In 2NF:
Student Table:
StudentID StudentName
101 Raj
Course Table:
Course Instructor
DBMS Prof. Das
Enrollment Table:
StudentID Course
101 DBMS
Third Normal Form (3NF)
• No transitive dependency (non-key attribute depends on another non-key attribute)
❌ Not in 3NF:
DeptID DeptName
D1 Sales
Boyce-Codd Normal Form (BCNF)
• For every functional dependency X → Y, X must be a super key
❌ Not in BCNF:
But neither Course nor Room is a super key.
✅ In BCNF:
Break the table into two:
Course_Instructor Table:
Course Instructor
DBMS Prof. A
Room_Instructor Table:
Room Instructor
R101 Prof. A
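Fourth Normal Form (4NF)
• No non-trivial multi-valued dependency X →→ Y unless X is a super key.
❌ Not in 4NF:
• Hobby and Language are independent of each other but both depend on Student → multi-valued
dependency
✅ In 4NF: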
Student_Hobby Table:
Student Hobby
Raj Cricket
Raj Chess
Student_Language Table:
Student Language
Raj English
Raj Hindi
• Join dependencies are not preserved
📘 Decomposition in DBMS
✅ What is Decomposition?
Decomposition is the process of breaking a single relation (table) into two or more smaller
relations without losing any information.
It is used in Normalization to remove problems like redundancy, partial dependency,
transitive dependency, etc.
🔄 Why Decompose?
To solve the problems of:
• Redundancy (duplicate data)
• Update Anomalies
• Insertion Anomalies
• Deletion Anomalies
🎯 Goal of Decomposition
A good decomposition should satisfy:
1. Lossless-Join property (no data loss)
2. Dependency Preservation (all functional dependencies are maintained)
Let’s explain both:
1️⃣ Lossless-Join Property
If original table = join of smaller tables → it is lossless
📌 Rule: The attributes common to the decomposed tables must form a key in at least one of them.
2️⃣ Dependency Preservation
✅ This means all functional dependencies from the original table should be available in the
decomposed tables.
If any dependency is lost, it might not be possible to enforce integrity constraints.
📊 Example of Decomposition
🔸 Original Table (Not Normalized):
Roll Name Dept Dept_Location
101 Raj CSE Block A
Functional Dependencies:
• Roll → Name, Dept
• Dept → Dept_Location
🔄 Decomposed Tables:
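Student Table:
Roll Name Dept
101 Raj CSE
Department Table:
Dept Dept_Location
CSE Block A
This decomposition:
• ✅ Is Lossless (we can join using Dept)
• ✅ Preserves dependencies (each FD can be checked within a single table)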
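For a split into two tables, the lossless-join property can be tested with attribute closure: the split is lossless when the common attributes determine all attributes of at least one of the two tables. The function names and the second, deliberately bad split are assumptions made for this sketch.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (left, right) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary test: the shared attributes must determine all of r1 or all of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

# FDs from the example: Roll -> Name, Dept and Dept -> Dept_Location.
fds = [({"Roll"}, {"Name", "Dept"}), ({"Dept"}, {"Dept_Location"})]

student = {"Roll", "Name", "Dept"}
department = {"Dept", "Dept_Location"}
good_split = is_lossless(student, department, fds)

# Hypothetical alternative split that shares only Name, which determines nothing.
bad_split = is_lossless({"Roll", "Name"}, {"Name", "Dept", "Dept_Location"}, fds)

print(good_split, bad_split)  # True False
```

The first split shares Dept, which is a key of the Department table, so the join recovers the original rows. The second shares only Name, so the join could pair unrelated rows.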
❗ Types of Decomposition
Type Description
Lossless Decomposition No loss of data during decomposition and recombination
Lossy Decomposition Some data is lost when joining back the tables
Dependency Preserving All original FDs are present after decomposition
Non-Dependency Preserving Some FDs are missing in the decomposed tables
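❌ If not done properly, decomposition may result in lossy joins.
Employee Table:
EmpID EmpName
1 Anil
Project Table:
EmpID Project
1 Alpha
👉 Joining these two on EmpID is safe only as long as EmpID is a key of the Employee table.
When the common attribute is not a key in either table, the join may produce extra or wrong
combinations once multiple projects or employees exist.
Hence, such a decomposition can be lossy.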
🧠 Conditions for Good Decomposition
Condition Explanation
Lossless Join You must be able to get the original table by joining decomposed tables
Dependency Preservation All original FDs must be derivable from decomposed relations
Minimal Redundancy No duplicate or unnecessary data after decomposition