
M5 - Relational Model and Relational Database

The document provides an overview of the Relational Model, emphasizing its structure of tables (relations) made up of rows (tuples) and columns (attributes). It discusses design issues in relational databases, such as redundancy, update anomalies, and data integrity, and introduces various types of keys essential for maintaining data integrity. Additionally, it explains functional dependencies, their importance in database design, and the concept of normalization to avoid anomalies.

Uploaded by

arunabhagain07

📘 Relational Model and Relational Database Design

🔹 1. Concept of Relational Model


📌 Definition:
The Relational Model is a way of representing data in tables (called relations). Each relation is
made of rows (tuples) and columns (attributes).

🧾 Key Features:
• Data is stored in relations (tables).
• Each row represents a tuple (record).
• Each column represents an attribute (field).
• Each relation has a unique name.

📊 Example:
Student_ID Name Dept
101 Ravi CSE
102 Priya IT
103 Amit ECE
Here, Student is the relation (table) with 3 attributes: Student_ID, Name, Dept.

📘 Design Issues
🔰 Introduction
When designing a relational database, we need to ensure that the database is efficient, free from
unnecessary duplication, and easy to update. Poor design can lead to data redundancy,
inconsistencies, and anomalies.
The goal is to create a design that:
• Minimizes redundancy
• Avoids anomalies
• Maintains data integrity
• Is easy to query and update
✅ Major Design Issues
1. Redundancy
📌 What is it?
Redundancy means storing the same data in multiple places.
⚠️Problem:
• Wastes storage space
• Increases chances of inconsistency if data is not updated everywhere
🧾 Example:
A student’s department name is stored in every student record. If the department name changes, all
records must be updated.

2. Update Anomalies
📌 What is it?
Problems that occur when we insert, update, or delete data in a poorly designed database.
⚠️ Types:
• Update Anomaly: Changing one piece of data requires updating multiple rows.
• Insert Anomaly: Cannot add data because other required data is missing.
• Delete Anomaly: Deleting data may unintentionally remove important facts.
🧾 Example:
• Deleting a student may also remove the only record of a department.

3. Null Values
📌 What is it?
A null value means missing or unknown data.
⚠️Problem:
• Difficult to understand and handle in queries
• May lead to errors in calculations
🧾 Example:
If a teacher has not yet been assigned to a course, the “Teacher_ID” field might be null.
4. Spurious Tuples (Fake Data)
📌 What is it?
When we join poorly designed tables, it may create fake rows that don’t exist in the real world.
⚠️Problem:
• Produces incorrect information
• Leads to confusion and wrong reports
🧾 Example:
If student and course tables are joined on the wrong key, we might get combinations that never
existed.

5. Loss of Information
📌 What is it?
When decomposing tables to remove redundancy, we might lose some original information.
⚠️Problem:
• Important data may become unavailable
• Cannot reconstruct original table properly
🔍 Solution:
Use lossless decomposition where the original table can be recovered exactly.

6. Dependency Preservation
📌 What is it?
All functional dependencies should still be valid after decomposition.
⚠️Problem:

• If dependencies are not preserved, data integrity rules cannot be enforced.
🔍 Solution:
Check and ensure all original dependencies still apply in the new design.

7. Data Integrity
📌 What is it?

Ensuring the accuracy, consistency, and reliability of data.
📏 Integrity Constraints:
• Entity Integrity: The primary key must be unique and must not be null.
• Referential Integrity: A foreign key must match an existing primary key value in the referenced table, or be null.
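The two constraints above can be tried out with a small script. This is only a sketch using Python's built-in sqlite3 module; the table layout follows the Student/Course example used later in these notes, and the sample values and the helper name try_insert are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE Course ("
    " CourseID TEXT NOT NULL PRIMARY KEY,"
    " RollNo INTEGER REFERENCES Student(RollNo))"
)
conn.execute("INSERT INTO Student VALUES (101, 'Ravi')")
conn.execute("INSERT INTO Course VALUES ('C1', 101)")   # foreign key matches a primary key
conn.execute("INSERT INTO Course VALUES ('C2', NULL)")  # a null foreign key is allowed

def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Entity integrity: duplicate or null primary keys are rejected.
duplicate_key_ok = try_insert("INSERT INTO Student VALUES (101, 'Priya')")
null_key_ok = try_insert("INSERT INTO Course VALUES (NULL, 101)")
# Referential integrity: RollNo 999 does not exist in Student.
dangling_fk_ok = try_insert("INSERT INTO Course VALUES ('C3', 999)")
course_count = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]
```

All three rejected statements leave the tables unchanged, so Course still holds only the two valid rows.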
8. Efficient Query Processing
📌 What is it?
Design must allow faster retrieval and manipulation of data.
⚠️Problem:

• Too many tables or poor indexing can slow down queries.
🔍 Solution:
Balance between normalization and performance.

🔑 Keys in Relational Model (DBMS)
What is a Key?
A key is an attribute (column) or a set of attributes that is used to uniquely identify a row (tuple) in
a relation (table).
Keys are very important for:
• Maintaining data integrity
• Avoiding duplicate records
• Defining relationships between tables

🔎 Types of Keys
1. Super Key
📘 Definition:

A super key is any set of attributes that uniquely identifies a tuple (row) in a relation.
🧾 Example:
In a Student table:
Student(RollNo, Name, Email, Phone)

Possible super keys:
• {RollNo}
• {RollNo, Name}
• {Email}
• {Phone, RollNo}
Note: A super key may contain extra attributes.
2. Candidate Key
📘 Definition:
A candidate key is a minimal super key: it has no extra attribute and still uniquely identifies
each tuple.
⚠️ Rule:
• Must be unique
• Must be minimal
🧾 Example:
From the Student table above:
• {RollNo} and {Email} are both candidate keys (assuming each is unique)
Candidate keys are exactly the super keys from which no attribute can be removed without losing uniqueness.
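The difference between a super key and a candidate key can be checked on sample rows. The sketch below is only an illustration: the rows and the function names is_super_key and is_candidate_key are made up for this example. Keep in mind that a key is a rule about all possible data, so sample rows can show that a set of attributes is not a key but can never prove that it is one.

```python
from itertools import combinations

# Hypothetical sample rows for Student(RollNo, Name, Email, Phone).
rows = [
    {"RollNo": 1, "Name": "Ravi", "Email": "ravi@example.com", "Phone": "111"},
    {"RollNo": 2, "Name": "Priya", "Email": "priya@example.com", "Phone": "222"},
    {"RollNo": 3, "Name": "Ravi", "Email": "ravi2@example.com", "Phone": "222"},
]

def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """A super key is a candidate key if no proper subset is also a super key."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )
```

Here {RollNo, Name} is a super key but not a candidate key, because {RollNo} alone is already unique.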

3. Primary Key
📘 Definition:
A primary key is one of the candidate keys chosen by the database designer to uniquely identify
records.
⚠️Rules:
• Cannot be null
• Must be unique
• Only one primary key per table
🧾 Example:
In Student table, if RollNo is selected, then:
PRIMARY KEY = {RollNo}

4. Alternate Key
📘 Definition:

All candidate keys not chosen as the primary key are called alternate keys.
🧾 Example:
If RollNo is the primary key, then Email (another candidate key) is the alternate key.
5. Composite Key
📘 Definition:
A composite key is a key that consists of two or more attributes that together uniquely identify a
tuple.
🧾 Example:
In a table storing course enrollments:
Enrollment(StudentID, CourseID, Date)

Here, neither StudentID nor CourseID alone is unique, but the combination {StudentID,
CourseID} is unique.

So:
COMPOSITE KEY = {StudentID, CourseID}

6. Foreign Key
📘 Definition:

A foreign key is an attribute in one table that refers to the primary key of another table.
🧾 Example:
Two tables:
Student(RollNo, Name)

Course(CourseID, RollNo)


Here, RollNo in Course is a foreign key referring to Student.RollNo.

Used to establish relationships between tables.

📋 Summary Table of Keys

Key Type | Definition | Uniqueness | Null Allowed | Example
Super Key | Any set of attributes that uniquely identifies a row | Yes | Yes (if not PK) | {RollNo}, {RollNo, Name}
Candidate Key | Minimal super key with no extra attribute | Yes | No | {RollNo}, {Email}
Primary Key | Selected candidate key | Yes | No | {RollNo}
Alternate Key | Other candidate keys not selected as primary | Yes | No | {Email}
Composite Key | Key formed by multiple attributes | Yes | No | {StudentID, CourseID}
Foreign Key | Refers to primary key in another table | Not always | Yes | RollNo in Course refers to Student

🔐 Closure of Attribute Set

📘 What is Closure?
The closure of an attribute set is the set of all attributes that can be functionally determined
from a given set of attributes using a set of functional dependencies (FDs).
It helps us:
• Find candidate keys
• Check normalization
• Test if a decomposition is lossless
• Preserve dependencies

🧠 Formal Definition
Let:
• F be a set of functional dependencies

• X be a set of attributes

Then the closure of X, denoted as X⁺, is the set of all attributes A such that X → A can be derived
from F.

🎯 Purpose of Finding X⁺ (X Closure)


1. To determine candidate keys of a relation.
2. To test dependency preservation in decomposition.
3. To check if a functional dependency X → Y holds under a given set of FDs.

🔁 Algorithm: How to Find Closure (X⁺)


Step-by-Step:
1. Start with X⁺ = X (initially the same as the given attribute set)
2. Repeat:
• For each functional dependency A → B in F:
• If A ⊆ X⁺, then add B to X⁺
3. Stop when no more attributes can be added to X⁺
🧾 Example
Relation:
R(A, B, C, D, E)

Functional Dependencies (F):


1. A → B
2. B → C
3. A → D

Task: Find closure of {A}, i.e., A⁺


Step-by-step:
Step | Current X⁺ | FD Applied | Reason
1 | {A} | A → B | A is in X⁺, so add B
2 | {A, B} | B → C | B is in X⁺, so add C
3 | {A, B, C} | A → D | A is in X⁺, so add D
4 | {A, B, C, D} | (none) | No more FDs can be applied
Final Closure: A⁺ = {A, B, C, D}
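The closure steps can be sketched in Python. This is a minimal sketch, not a library routine: the function name closure and the choice to store each FD as a pair of sets (left side, right side) are assumptions made for illustration. It recomputes A⁺ for the relation R(A, B, C, D, E) used in the example.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat until a full pass adds nothing
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs only if lhs is already inside the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of the example: A -> B, B -> C, A -> D
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"D"})]
a_plus = closure({"A"}, fds)   # {'A', 'B', 'C', 'D'}
b_plus = closure({"B"}, fds)   # {'B', 'C'}
```

Because E never appears in a_plus, A alone cannot be a key of R(A, B, C, D, E).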

🔍 Another Example
Relation:
R(P, Q, R, S)

FDs:
1. P → Q
2. Q → R
3. R → S

Find: P⁺
Step | X⁺ | FD Used | Result
1 | {P} | P → Q | Add Q
2 | {P, Q} | Q → R | Add R
3 | {P, Q, R} | R → S | Add S
P⁺ = {P, Q, R, S}

❓ Use Case: Checking Candidate Key


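A candidate key must functionally determine all attributes in the relation, and no proper subset of it may do so.
Example:
If R = (A, B, C, D, E) and A⁺ = {A, B, C, D, E},
then A is a super key, and because it is a single attribute it is also a candidate key.
(In the first closure example, A⁺ = {A, B, C, D} does not contain E, so A alone is not a key there.)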
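Closure can be used to search for candidate keys by trying attribute sets from smallest to largest. The sketch below is a brute-force illustration that suits small relations only; the names closure and candidate_keys and the set-pair format for FDs are assumptions, not part of any standard tool.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Return every minimal attribute set whose closure is the whole relation."""
    relation = set(relation)
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == relation:
                keys.append(candidate)
    return keys

# R(A, B, C, D, E) with A -> B, B -> C, A -> D
keys_r = candidate_keys("ABCDE", [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"D"})])
# R(P, Q, R, S) with P -> Q, Q -> R, R -> S
keys_p = candidate_keys("PQRS", [({"P"}, {"Q"}), ({"Q"}, {"R"}), ({"R"}, {"S"})])
```

For the first relation the only candidate key is {A, E}, since E is not determined by any FD; for the second it is {P}.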

📘 Functional Dependency in DBMS

🧠 What is Functional Dependency?


In a Relational Database, Functional Dependency (FD) is a relationship between two sets of
attributes of a relation (table).

🔹 Easy Meaning:
If we know the value of one attribute (or a group of attributes), and we can uniquely identify the
value of another attribute, then we say:
👉 "One attribute functionally determines the other."

🧾 Example:
Imagine a table Student with columns:
RollNo, Name, Department, Phone

Now look at this:


• If we know the RollNo, we can find out:
• the Name of the student
• the Department
• the Phone number
So we can say:
RollNo → Name

RollNo → Department

RollNo → Phone

This means RollNo functionally determines Name, Department, and Phone.

📙 Formal Definition
A Functional Dependency X → Y between two sets of attributes X and Y in a relation means:
If two rows (tuples) have the same value of X, then they must have the same value of Y.
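The formal definition translates directly into a check over the rows of a table. The sketch below is an illustration with made-up sample rows and a hypothetical function name fd_holds. A check like this can only show that an FD is violated by the current data; whether an FD really holds is a rule about the real world.

```python
def fd_holds(rows, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y value seen for each X value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample rows: two different students share the name "Ravi".
students = [
    {"RollNo": 1, "Name": "Ravi", "Department": "CSE"},
    {"RollNo": 2, "Name": "Priya", "Department": "IT"},
    {"RollNo": 3, "Name": "Ravi", "Department": "ECE"},
]

rollno_to_name = fd_holds(students, ["RollNo"], ["Name"])    # True
name_to_dept = fd_holds(students, ["Name"], ["Department"])  # False
```

Name → Department fails because the two rows with Name = Ravi have different departments.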
🔐 Real-Life Example
In a library database:
Book Table
ISBN, Title, Author, Publisher

Here:
• Each book has its own unique ISBN.
• If you know the ISBN, you can find the exact Title, Author, and Publisher.
So we write:

ISBN → Title, Author, Publisher

📌 Why is Functional Dependency Important?


Functional Dependencies help in:
• Understanding how attributes are related.
• Finding the primary key or candidate keys.
• Normalizing the database to remove redundancy.
• Avoiding update, insertion, and deletion anomalies.

🔄 Different Types of Functional Dependencies

1. 🔹 Trivial Functional Dependency


A Functional Dependency is trivial when:

The right side is already included in the left side.
🧾 Example:
• A → A (trivial)
• (A, B) → A (trivial)
✔ These are always true and not very useful.

2. 🔹 Non-Trivial Functional Dependency


A dependency is non-trivial when:

The right side is not included in the left side.
🧾 Example:
• RollNo → Name (non-trivial)
✔ These are useful for designing a database.

3. 🔹 Full Functional Dependency


A dependency X → Y is called fully functional when:

Y depends on the whole of X, not just a part.
🧾 Example:
In table Enrollment(StudentID, CourseID, Grade)
• (StudentID, CourseID) → Grade (Full FD)
• Neither StudentID → Grade nor CourseID → Grade holds on its own, because a student can have a different grade in each course

4. 🔹 Partial Functional Dependency


This happens when:

Y depends on part of a composite key (not the whole).
🧾 Example:
If:
• Primary Key = (StudentID, CourseID)
• FD: StudentID → StudentName (only part of the key)
Then it is a partial dependency, which is not allowed in 2NF.

5. 🔹 Transitive Functional Dependency


Transitive dependency occurs when:

X → Y and Y → Z, so X → Z holds only indirectly, through Y.
🧾 Example:
• A → B and B → C
• So A → C (transitive)
In 3NF, a non-key attribute must not depend on the key through another non-key attribute.

🔁 Rules to Derive Functional Dependencies


(📜 Armstrong’s Axioms)
These rules help you generate new FDs from given FDs.

Rule Description
Reflexivity If Y is part of X (Y ⊆ X), then X → Y
Augmentation If X → Y, then XZ → YZ (adding the same attributes Z to both sides)
Transitivity If X → Y and Y → Z, then X → Z
📊 Functional Dependency Diagram (Arrow Notation)
Here's how we show FDs using arrows:
RollNo ──────→ Name

(FD)

Another example:
(StudentID, CourseID) ──────→ Grade

📘 Anomalies in Database Design

✅ What are Anomalies?


In DBMS, anomalies are problems or errors that occur when a database is poorly designed,
usually in un-normalized tables.
They can lead to:
• Redundant data
• Inconsistent data
• Difficulty in updating/deleting/inserting data

🔎 When Do Anomalies Happen?


Anomalies mostly happen when:
• All data is stored in one big table that satisfies only 1NF
• There is repeated (redundant) data
• There are dependencies between non-key columns

🧩 Types of Anomalies
There are mainly three types of anomalies:

1. ✏️Insertion Anomaly
This happens when we cannot insert a new record into the database without including unrelated
data.
🧾 Example:
Student_Course Table

StudentID StudentName Course Instructor
101 Rahul DBMS Prof. Das
102 Priya OS Prof. Singh
👉 Now, suppose we want to add a new course that has no students enrolled yet.
We cannot insert it unless we also insert a fake or dummy student.
Problem:
We are forced to insert unnecessary data, or we can't insert valid data.

2. 📝 Update Anomaly
This happens when the same data is repeated in many rows, and we have to update it in all places. If
we forget one, data becomes inconsistent.
🧾 Example:

StudentID StudentName Course Instructor
101 Rahul DBMS Prof. Das
105 Neha DBMS Prof. Das
👉 If Prof. Das gets replaced by Prof. Sharma for DBMS, we must update both rows.
Problem:
If we forget one row, data becomes wrong or inconsistent.

3. 🗑️ Deletion Anomaly

This happens when deleting a row also removes valuable data that we still need.
🧾 Example:

StudentID StudentName Course Instructor
103 Amit AI Prof. Roy
👉 If Amit drops out and we delete his row, we also lose information about the AI course and
Prof. Roy.
Problem:
Deleting one thing causes loss of other useful data.

🛠️ Solution: Normalization
To solve anomalies, we apply Normalization.
• Break big, redundant tables into smaller related tables
• Ensure each table describes one kind of thing, with every non-key attribute depending on that table's key
• Apply normal forms: 1NF → 2NF → 3NF → BCNF

📘 Normalization and Normal Forms

✅ What is Normalization?
Normalization is a process in DBMS to organize data in a database by:
• Reducing redundancy (duplicate data)
• Eliminating anomalies (insertion, update, deletion problems)
• Ensuring data is stored logically and efficiently
It breaks a big table into smaller related tables while maintaining relationships.

🔄 Why Normalize?
Poor database design leads to:
• Data repetition
• Wasted space
• Difficult updates
• Wrong/incomplete results
Normalization solves these problems.

📊 Normal Forms (NF)


There are different levels or “forms” of normalization. Each level builds upon the previous one.

🔹 1NF – First Normal Form


✅ Rule:
• Each cell should contain atomic (single) values


• No repeating groups or arrays
Not in 1NF:

StudentID Name Courses


101 Raj DBMS, OS
➡ Courses column has multiple values
✅ In 1NF:

StudentID Name Course


101 Raj DBMS
101 Raj OS
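The conversion to 1NF shown above can be sketched as a small function. This is only an illustration: the function name to_first_normal_form and the assumption that the values are stored as one comma-separated string are made up for this example.

```python
def to_first_normal_form(rows, column, new_column, sep=","):
    """Split a multi-valued column into one row per atomic value."""
    result = []
    for row in rows:
        for value in row[column].split(sep):
            # Copy every other column unchanged and store a single value.
            new_row = {k: v for k, v in row.items() if k != column}
            new_row[new_column] = value.strip()
            result.append(new_row)
    return result

unnormalized = [{"StudentID": 101, "Name": "Raj", "Courses": "DBMS, OS"}]
normalized = to_first_normal_form(unnormalized, "Courses", "Course")
```

The result has two rows for student 101, one with Course = DBMS and one with Course = OS.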

🔹 2NF – Second Normal Form


✅ Rule:
• Must be in 1NF
• All non-key attributes must be fully functionally dependent on the entire primary key
👉 Removes partial dependency
Not in 2NF (composite key problem):

StudentID Course StudentName Instructor


Here:
• Primary Key = (StudentID, Course)
• StudentName depends only on StudentID → partial dependency
• Instructor depends only on Course → partial dependency
In 2NF:
Student Table:

StudentID StudentName
101 Raj
Course Table:

Course Instructor
DBMS Prof. Das
Enrollment Table:

StudentID Course
101 DBMS

🔹 3NF – Third Normal Form


✅ Rule:
• Must be in 2NF


• No transitive dependency (non-key attribute depends on another non-key attribute)
Not in 3NF:

EmpID EmpName DeptID DeptName


Here:
• DeptName depends on DeptID, not directly on EmpID

• EmpID → DeptID → DeptName (transitive dependency)
In 3NF:
Employee Table:

EmpID EmpName DeptID


1 Anil D1
Department Table:

DeptID DeptName
D1 Sales

🔹 BCNF – Boyce-Codd Normal Form


✅ Rule:
• Stronger version of 3NF


• For every non-trivial functional dependency X → Y, X must be a super key
Not in BCNF:

Course Instructor Room


DBMS Prof. A R101
Assume:
• One course has one instructor
• One room can have only one instructor
So:
• Course → Instructor
• Room → Instructor


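But neither Course nor Room is a super key; the only candidate key is {Course, Room}.
In BCNF:
Break the table into two, splitting on Course → Instructor:
Course_Instructor Table:

Course Instructor
DBMS Prof. A
Course_Room Table:

Course Room
DBMS R101
Course is the key of Course_Instructor, so joining the two tables on Course gives back the original rows.
Splitting into (Course, Instructor) and (Room, Instructor) instead would be lossy, because Instructor is not a key of either table.
The price is that Room → Instructor can no longer be checked inside a single table, so this decomposition does not preserve all dependencies.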
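The BCNF rule can be tested with attribute closure: an FD violates BCNF when the closure of its left side is not the whole relation. The sketch below is an illustration only; the function names and the set-pair format for FDs are assumptions. It checks only the FDs that are listed, so for a decomposed table the FDs that apply to that table must be passed in.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    relation = set(relation)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != relation
    ]

relation = {"Course", "Instructor", "Room"}
fds = [({"Course"}, {"Instructor"}), ({"Room"}, {"Instructor"})]
violations = bcnf_violations(relation, fds)

# A two-column table keyed by Course has no violation.
course_instructor_ok = bcnf_violations(
    {"Course", "Instructor"}, [({"Course"}, {"Instructor"})]
) == []
```

Both FDs are reported for the three-column table, which matches the statement that neither Course nor Room is a super key.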

🔹 4NF – Fourth Normal Form


✅ Rule:
• Must be in BCNF
• No non-trivial multi-valued dependencies, except on a super key
👉 Multi-valued dependency means: for a single key value, one attribute has a set of values that is
independent of the other attributes.
Not in 4NF:

Student Hobby Language
Raj Cricket English
Raj Chess English
Raj Cricket Hindi
Raj Chess Hindi
Hobby and Language are independent but both related to Student → causes multi-valued
dependency
In 4NF:
Student_Hobby Table:

Student Hobby
Raj Cricket
Raj Chess
Student_Language Table:

Student Language
Raj English
Raj Hindi
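The split into Student_Hobby and Student_Language loses nothing only because every hobby is paired with every language. The sketch below shows this with plain Python sets; the helper names split and rejoin are made up for this illustration.

```python
# Rows of the Student/Hobby/Language table from the example.
rows = {
    ("Raj", "Cricket", "English"),
    ("Raj", "Chess", "English"),
    ("Raj", "Cricket", "Hindi"),
    ("Raj", "Chess", "Hindi"),
}

def split(rows):
    """Project onto (Student, Hobby) and (Student, Language)."""
    hobbies = {(s, h) for s, h, _ in rows}
    languages = {(s, l) for s, _, l in rows}
    return hobbies, languages

def rejoin(hobbies, languages):
    """Natural join of the two projections on Student."""
    return {(s, h, l) for s, h in hobbies for s2, l in languages if s == s2}

student_hobby, student_language = split(rows)
lossless = rejoin(student_hobby, student_language) == rows

# If one combination is missing, Hobby and Language are no longer independent
# and the same split would invent a row that was never stored.
partial = rows - {("Raj", "Chess", "Hindi")}
still_lossless = rejoin(*split(partial)) == partial
```

Four rows shrink to two rows per table, and the join gives back exactly the original four.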

🔹 5NF – Fifth Normal Form (Project-Join Normal Form)


✅ Rule:
• Must be in 4NF
• No join dependency other than those implied by the candidate keys
This form ensures that a table cannot be split further into smaller tables that re-join without loss, except by splits that follow from its candidate keys.
Example:
Not needed in most practical databases. Used only in very complex cases where:
• A relation can be reconstructed only by joining three or more of its projections
• That join dependency does not follow from the candidate keys

📘 Decomposition in DBMS
✅ What is Decomposition?
Decomposition is the process of breaking a single relation (table) into two or more smaller
relations without losing any information.
It is used in Normalization to remove problems like redundancy, partial dependency,
transitive dependency, etc.

🔄 Why Decompose?
To solve the problems of:
• Redundancy (duplicate data)
• Update Anomalies
• Insertion Anomalies
• Deletion Anomalies

🎯 Goal of Decomposition
A good decomposition should satisfy:
1. Lossless-Join property (no data loss)
2. Dependency Preservation (all functional dependencies are maintained)
Let’s explain both:

1️⃣Lossless Join Decomposition


✅ This means no information is lost when the smaller tables are joined back.
If original table = join of smaller tables → it is lossless
📌 Rule: For a split into two tables, the attributes they have in common must be a key (super key) of at least one of them.
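For a split into two tables, this rule can be checked with attribute closure: the shared attributes must determine all attributes of at least one of the two tables. The sketch below is an illustration; the function names are assumptions, and the FDs are the ones from the Roll/Dept example in this section.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if the attributes shared by r1 and r2 determine all of r1 or all of r2."""
    r1, r2 = set(r1), set(r2)
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

# Roll -> Name, Dept and Dept -> Dept_Location
fds = [({"Roll"}, {"Name", "Dept"}), ({"Dept"}, {"Dept_Location"})]

good_split = is_lossless_binary(
    {"Roll", "Name", "Dept"}, {"Dept", "Dept_Location"}, fds
)  # shared attribute Dept is a key of the second table
bad_split = is_lossless_binary(
    {"Roll", "Name"}, {"Name", "Dept", "Dept_Location"}, fds
)  # shared attribute Name determines nothing else
```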

2️⃣Dependency Preservation
✅ This means all functional dependencies from the original table should be available in the
decomposed tables.
If any dependency is lost, it might not be possible to enforce integrity constraints.

📊 Example of Decomposition
🔸 Original Table (Not Normalized):
Roll Name Dept Dept_Location
101 Raj CSE Block A
Functional Dependencies:
• Roll → Name, Dept
• Dept → Dept_Location
🔄 Decomposed Tables:
Student Table:

Roll Name Dept
101 Raj CSE
Department Table:

Dept Dept_Location
CSE Block A

This decomposition:
• Is Lossless (the common attribute Dept is the key of the Department table)
• Preserves Dependencies (Roll → Name, Dept is in Student; Dept → Dept_Location is in Department)
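This decomposition can be tried out with Python's built-in sqlite3 module. The sketch is only an illustration: the table name Original and the two extra rows (102 and 103) are made up so that the effect on redundancy is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Original (Roll INTEGER, Name TEXT, Dept TEXT, Dept_Location TEXT)"
)
conn.executemany(
    "INSERT INTO Original VALUES (?, ?, ?, ?)",
    [
        (101, "Raj", "CSE", "Block A"),
        (102, "Priya", "CSE", "Block A"),  # hypothetical extra row
        (103, "Amit", "ECE", "Block B"),   # hypothetical extra row
    ],
)

# Decompose by projecting onto the two attribute sets.
conn.execute("CREATE TABLE Student AS SELECT DISTINCT Roll, Name, Dept FROM Original")
conn.execute(
    "CREATE TABLE Department AS SELECT DISTINCT Dept, Dept_Location FROM Original"
)

# Join the pieces back on the common attribute Dept.
rejoined = conn.execute(
    "SELECT s.Roll, s.Name, s.Dept, d.Dept_Location "
    "FROM Student s JOIN Department d ON s.Dept = d.Dept ORDER BY s.Roll"
).fetchall()
original = conn.execute("SELECT * FROM Original ORDER BY Roll").fetchall()
department_rows = conn.execute("SELECT COUNT(*) FROM Department").fetchone()[0]
```

The join returns exactly the original rows, and "Block A" is now stored once in Department instead of once per CSE student.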

❗ Types of Decomposition
Type Description
Lossless Decomposition No loss of data during decomposition and recombination
Lossy Decomposition Some data is lost when joining back the tables
Dependency Preserving All original FDs are present after decomposition
Non-Dependency Preserving Some FDs are missing in the decomposed tables

⚠️Problem: Lossy Decomposition


If not done properly, decomposition may result in lossy joins.

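Example of Lossy Decomposition:
Original Table:

EmpID EmpName Project
1 Anil Alpha
2 Anil Beta
If we decompose into:
Emp Table:

EmpID EmpName
1 Anil
2 Anil
Project Table:

EmpName Project
Anil Alpha
Anil Beta
👉 The two tables share only EmpName, which is not a key of either table. Joining them pairs every Anil
with every project, giving four rows, including the spurious rows (1, Anil, Beta) and (2, Anil, Alpha).
Hence, this decomposition is lossy. Splitting on EmpID instead, into (EmpID, EmpName) and (EmpID, Project),
would be lossless as long as EmpID → EmpName holds.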
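A lossy join is easy to reproduce when the shared column is not a key. The sketch below uses plain Python sets and assumes, purely for illustration, two employees who are both named Anil and a split whose only shared column is EmpName.

```python
# Hypothetical rows of (EmpID, EmpName, Project): two employees share a name.
original = {(1, "Anil", "Alpha"), (2, "Anil", "Beta")}

# Decompose so that the only shared column is EmpName, which is not a key.
emp_table = {(emp_id, name) for emp_id, name, _ in original}
project_table = {(name, project) for _, name, project in original}

# Natural join of the two pieces on EmpName.
rejoined = {
    (emp_id, name, project)
    for emp_id, name in emp_table
    for name2, project in project_table
    if name == name2
}

# Rows that appear after the join but were never in the original table.
spurious = rejoined - original
```

The join produces four rows instead of two. No row is missing, but the two extra rows make it impossible to tell which employee really works on which project, and that is the information that was lost.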
🧠 Conditions for Good Decomposition
Condition Explanation
Lossless Join You must be able to get the original table by joining decomposed tables
Dependency Preservation All original FDs must be derivable from decomposed relations
Minimal Redundancy No duplicate or unnecessary data after decomposition
