Summary of Chapter 14

Chapter 14 covers the fundamentals of functional dependencies and normalization in relational databases, emphasizing the importance of clear attribute semantics and reducing redundancy and NULL values. It introduces functional dependencies, keys, and the process of normalization through First, Second, and Third Normal Forms, providing examples to illustrate each concept. The chapter also discusses the implications of design violations and the significance of avoiding anomalies in database design.

Uploaded by

wafa.alazzeh49

Summary of Chapter 14: Basics of Functional Dependencies and Normalization for Relational
Databases, from Fundamentals of Database Systems (7th Edition) by Ramez Elmasri and
Shamkant B. Navathe, supplemented with illustrative examples:

■ Guideline 1 Making sure that the semantics of the attributes is clear in the schema.
Design a relation schema so that it is easy to explain its meaning. Do not combine
attributes from multiple entity types and relationship types into a single relation.

■ Guideline 2 Reducing the redundant information in tuples

■ Guideline 3 Reducing the NULL values in tuples

■ Guideline 4 Disallowing the possibility of generating spurious tuples

🔍 Functional Dependencies (FDs)

A functional dependency (FD) is a constraint between two sets of attributes in a relation.


Formally, in a relation R, an FD X → Y holds if, for any two tuples in R, whenever the
tuples agree on attributes X, they also agree on attributes Y. This implies that the values of
Y are determined by the values of X.

Example:

Consider a relation Employee with attributes:

 EmpID
 EmpName
 DeptID
 DeptName

If each employee belongs to one department, and each department has a unique name, the
following FDs hold:

 EmpID → EmpName, DeptID


 DeptID → DeptName

This means:

 Knowing EmpID allows us to determine the corresponding EmpName and DeptID.


 Knowing DeptID allows us to determine the corresponding DeptName.
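
The definition can be checked against a single relation state. The following is a minimal Python sketch; the helper name `fd_holds` and the third sample row are illustrative assumptions, not part of the chapter. A state can only refute an FD: passing this check does not prove that the FD holds in every possible state.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is not violated by this relation state."""
    seen = {}  # maps each X-value to the first Y-value seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample state of the Employee relation.
employee = [
    {"EmpID": 1, "EmpName": "Alice", "DeptID": 10, "DeptName": "HR"},
    {"EmpID": 2, "EmpName": "Bob", "DeptID": 20, "DeptName": "IT"},
    {"EmpID": 3, "EmpName": "Carol", "DeptID": 10, "DeptName": "HR"},
]

print(fd_holds(employee, ["EmpID"], ["EmpName", "DeptID"]))  # True
print(fd_holds(employee, ["DeptID"], ["DeptName"]))          # True
print(fd_holds(employee, ["DeptName"], ["EmpID"]))           # False
```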

🔑 Keys and Superkeys


 A superkey is a set of attributes that uniquely identifies a tuple in a relation.
 A candidate key is a minimal superkey; that is, a superkey with no redundant
attributes.
 A primary key is a chosen candidate key to uniquely identify tuples in a relation.

Example:

In the Employee relation:

 EmpID is a candidate key because it uniquely identifies each employee.


 {EmpID, DeptID} is a superkey but not a candidate key, as EmpID alone suffices.

🧱 Normalization and Normal Forms

Normalization is the process of organizing data in a database to reduce redundancy and


improve data integrity. It involves decomposing relations into well-structured forms based
on functional dependencies.

First Normal Form (1NF):

A relation is in 1NF if:

 All attributes contain only atomic (indivisible) values.


 There are no repeating groups or arrays.

Second Normal Form (2NF):

A relation is in 2NF if:

 It is in 1NF.
 Every non-prime attribute is fully functionally dependent on the entire primary key.

Third Normal Form (3NF):

A relation is in 3NF if:

 It is in 2NF.
 There are no transitive dependencies; that is, non-prime attributes are not dependent
on other non-prime attributes.

Example:

Consider a relation Student with attributes:


 StudentID (Primary Key)
 StudentName
 Major
 AdvisorName

Assuming each major has a specific advisor, the FD Major → AdvisorName holds. This
introduces a transitive dependency:

 StudentID → Major
 Major → AdvisorName

To achieve 3NF, we decompose the relation:

1. Student(StudentID, StudentName, Major)


2. MajorAdvisor(Major, AdvisorName)

This eliminates the transitive dependency and ensures that each relation is in 3NF.

Let's go through more examples of normalization using simple data to show clearly how
each normal form works and why we use it.

🧱 First Normal Form (1NF)

Problem: Repeating Groups

StudentID Name Courses


1 Alice Math, English
2 Bob Science

Here, the Courses attribute contains multiple values. This violates 1NF.

Solution:

Break the repeating group into multiple rows:

StudentID Name Course


1 Alice Math
1 Alice English
2 Bob Science

✅ Now it’s in 1NF: all values are atomic (no multi-valued attributes).
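
The same flattening step can be sketched in Python. The function name `to_1nf` and the comma-separated encoding of the Courses value are assumptions made for illustration.

```python
# Unnormalized state: the Courses attribute holds several values at once.
unnormalized = [
    {"StudentID": 1, "Name": "Alice", "Courses": "Math, English"},
    {"StudentID": 2, "Name": "Bob", "Courses": "Science"},
]


def to_1nf(rows):
    """Emit one row per (student, course) pair so every value is atomic."""
    flat = []
    for row in rows:
        for course in row["Courses"].split(","):
            flat.append({
                "StudentID": row["StudentID"],
                "Name": row["Name"],
                "Course": course.strip(),
            })
    return flat


for r in to_1nf(unnormalized):
    print(r)
```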
🧱 Second Normal Form (2NF)

Problem: Partial Dependencies

StudentID Course StudentName


1 Math Alice
1 English Alice
2 Science Bob

Primary key = (StudentID, Course)

But StudentName depends only on StudentID, not on the full key (StudentID, Course) →
partial dependency = 🚫 violates 2NF.

Solution: Decompose

Students

StudentID StudentName
1 Alice
2 Bob

Enrollments

StudentID Course
1 Math
1 English
2 Science

✅ Now each non-key attribute depends on the full primary key → 2NF.
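
A decomposition like this is a pair of projections. The sketch below assumes a hypothetical helper `project` that mimics relational projection, including duplicate elimination.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result


enrolled = [
    {"StudentID": 1, "Course": "Math", "StudentName": "Alice"},
    {"StudentID": 1, "Course": "English", "StudentName": "Alice"},
    {"StudentID": 2, "Course": "Science", "StudentName": "Bob"},
]

students = project(enrolled, ["StudentID", "StudentName"])
enrollments = project(enrolled, ["StudentID", "Course"])

print(students)     # Alice now appears once instead of twice
print(enrollments)
```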

🧱 Third Normal Form (3NF)

Problem: Transitive Dependencies

EmpID EmpName DeptID DeptName


1 Alice 10 HR
2 Bob 20 IT

Here:

 EmpID → DeptID
 DeptID → DeptName
 So: EmpID → DeptName (a transitive dependency)

DeptName depends on DeptID, which is not a key, so this violates 3NF.

Solution: Decompose

Employees

EmpID EmpName DeptID


1 Alice 10
2 Bob 20

Departments

DeptID DeptName
10 HR
20 IT

✅ No transitive dependency now → 3NF
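
One way to confirm that nothing was lost is to join the two tables back together and compare the result with the original. This is a rough sketch; `natural_join` and `as_set` are hypothetical helper names.

```python
def natural_join(left, right):
    """Join two lists of dict rows on all attribute names they share."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined


def as_set(rows):
    """Order-independent view of a relation state, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


original = [
    {"EmpID": 1, "EmpName": "Alice", "DeptID": 10, "DeptName": "HR"},
    {"EmpID": 2, "EmpName": "Bob", "DeptID": 20, "DeptName": "IT"},
]
employees = [
    {"EmpID": 1, "EmpName": "Alice", "DeptID": 10},
    {"EmpID": 2, "EmpName": "Bob", "DeptID": 20},
]
departments = [
    {"DeptID": 10, "DeptName": "HR"},
    {"DeptID": 20, "DeptName": "IT"},
]

rejoined = natural_join(employees, departments)
print(as_set(rejoined) == as_set(original))  # True: the join is lossless here
```

The join is on DeptID, which is the key of Departments, so each employee row matches exactly one department row.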

🏅 Summary of the Normal Forms with Hints

Normal Form Fixes Look for


1NF Repeating groups Multi-valued attributes
2NF Partial dependencies Non-key columns depending on part of key
3NF Transitive dependencies Non-key depending on other non-key columns

Perfect, stopping at 3NF is totally fine—that’s what most real-world databases aim for
anyway. You've now got a solid grasp of:

✅ 1NF — No repeating groups


✅ 2NF — No partial dependencies
✅ 3NF — No transitive dependencies

Great! Let’s go through the Review Questions from Chapter 14 one by one, with clear
and simple answers + examples where needed. This will help reinforce your understanding
of the key ideas in this chapter.
✅ 14.1 Discuss attribute semantics as an informal measure of goodness for a relation
schema.

Answer:

 Attribute semantics refers to the meaning of each attribute and how it relates to
others in a relation.
 A relation schema is considered “good” if:
o All its attributes describe the same real-world concept.
o Each attribute is directly related to the key.

Why it matters:

 If unrelated attributes are in the same table, it leads to redundancy and anomalies.

Example: A relation like:

| StudentID | StudentName | CourseID | InstructorName |

This is bad if a student takes multiple courses—then InstructorName is not really about the
student, leading to redundancy.

Better design:

 One table for Students


 One for Courses
 A relation for Enrollments

✅ 14.2. Discuss insertion, deletion, and modification anomalies. Why are they
considered bad? Illustrate with examples.

Answer:

 Insertion anomaly: You can’t insert data without some unrelated info.
o Ex: Can’t add a new course unless a student is enrolled.
 Deletion anomaly: Deleting one fact deletes unrelated info.
o Ex: Deleting the last student from a course deletes the course data.
 Modification anomaly: Updating info in one place but forgetting in another causes
inconsistency.
o Ex: Instructor name updated for one row, but not others.

Illustration:
StudentID StudentName CourseID Instructor
1 Alice CS101 Dr. Smith
2 Bob CS101 Dr. Smith

If Dr. Smith changes to Dr. John and we forget to update one row, we have inconsistency.

✅ 14.3. Why should NULLs in a relation be avoided as much as possible? Discuss the
problem of spurious tuples and how we may prevent it.

Avoiding NULLs:

 NULLs mean “unknown” or “not applicable” — they can cause:


o Confusing query results
o Difficulty with aggregate functions (SUM, AVG)
o More complex logic (IS NULL, IS NOT NULL)

Spurious Tuples:

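 These are rows that appear when decomposed relations are joined back together but that
were not in the original relation, so they represent invalid information.
 Caused by joining on attributes that are not a (primary key, foreign key) pair.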
 Prevented by:
o Using proper keys
o Only decomposing with lossless-join properties
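
A small Python sketch of how spurious tuples arise. The relation, its attribute order (employee, project, location), and the sample values are hypothetical; the decomposition deliberately joins on a non-key attribute.

```python
# Original relation state: (EmpName, Project, Location).
works_on = {
    ("Alice", "P1", "Houston"),
    ("Bob", "P2", "Houston"),
}

# Bad decomposition: the only shared attribute, Location, is not a key of either part.
emp_loc = {(e, loc) for e, p, loc in works_on}
proj_loc = {(p, loc) for e, p, loc in works_on}

# Natural join of the two projections on Location.
rejoined = {
    (e, p, loc1)
    for e, loc1 in emp_loc
    for p, loc2 in proj_loc
    if loc1 == loc2
}

spurious = rejoined - works_on
print(sorted(spurious))  # [('Alice', 'P2', 'Houston'), ('Bob', 'P1', 'Houston')]
```

The join returns four tuples where the original held two; the extra two were never facts in the original relation.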

✅ 14.4. State the informal guidelines for relation schema design that we discussed.
Illustrate how violation of these guidelines may be harmful.

Guidelines:

1. Each relation should describe one concept.


2. Attributes should be directly related to the key.
3. Avoid redundancy and NULLs.
4. Avoid spurious tuples by proper decomposition.

Violation Example: | StudentID | StudentName | DeptID | DeptName | DeptLocation |

Here, student info and department info are mixed → if one department has no students, it
might not be recorded at all.
✅ 14.5. What is a functional dependency? What are the possible sources of the
information that defines the functional dependencies that hold among the attributes of
a relation schema?

Answer:

 A functional dependency (FD) is when one attribute (or set) determines another.
o Notation: A → B (if we know A, we know B)
 Sources of FDs:
o Real-world rules (e.g., "each employee has a unique ID")
o Business policies
o Application constraints
o Domain knowledge

✅ 14.6. Why can we not infer a functional dependency automatically from a particular
relation state?

Answer:

 A relation state is just one snapshot of data.


 Functional dependencies must hold for all possible states of the relation.
 Just because an FD seems to hold now doesn’t mean it’s always true.

Example:

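EmpID Name
1 Alice
2 Alice

In this state, EmpID → Name appears to hold, but one snapshot cannot prove it: a later
state could contain two tuples with the same EmpID and different names. A state can only
disprove an FD. Here, Name → EmpID is ruled out, because two tuples share the name Alice
but have different EmpIDs.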

Conclusion: FDs come from the meaning and rules of the data, not from just one sample.

✅ 14.7. What does the term unnormalized relation refer to? How did the normal forms
develop historically from first normal form up to Boyce-Codd normal form?

Unnormalized Relation (UNF):

 A relation that contains repeating groups or multi-valued attributes.


 Not suitable for relational databases.

Example:
StudentID Name Courses
1 Amy Math, English

Normalization History:

 1NF: Eliminate repeating groups → atomic values.


 2NF: Eliminate partial dependencies.
 3NF: Eliminate transitive dependencies.
 BCNF: Eliminate all anomalies caused by any dependency where determinant is not
a superkey.

✅ 14.8. Define first, second, and third normal forms when only primary keys are
considered. How do the general definitions of 2NF and 3NF, which consider all keys of
a relation, differ from those that consider only primary keys?

When only primary keys are considered:

 1NF: Atomic values.


 2NF: No partial dependency on the primary key.
 3NF: No non-prime attribute is transitively dependent on the primary key.

General definitions:

 Consider all candidate keys, not just the primary key.


 Ensures that all forms of redundancy are avoided, not just ones tied to the primary
key.

✅ 14.9. What undesirable dependencies are avoided when a relation is in 2NF?

 Partial dependencies, where a non-key attribute depends only on part of a


composite key.
 2NF ensures all non-key attributes are fully functionally dependent on the whole
key.

✅ 14.10. What undesirable dependencies are avoided when a relation is in 3NF?

 Transitive dependencies:
o When A → B and B → C, then A → C is transitive.
 3NF ensures that non-key attributes don’t depend on other non-key attributes.
✅ 14.11. In what way do the generalized definitions of 2NF and 3NF extend the
definitions beyond primary keys?

 Generalized definitions look at all candidate keys, not just the primary one.
 This prevents anomalies that would still exist if you only normalize based on the
primary key.

14.19. Suppose that we have the following requirements for a university database that is
used to keep track of students’ transcripts:

a. The university keeps track of each student’s name (Sname), student number (Snum),
Social Security number (Ssn), current address (Sc_addr) and phone (Sc_phone), permanent
address (Sp_addr) and phone (Sp_phone), birth date (Bdate), sex (Sex), class (Class)
(‘freshman’, ‘sophomore’, … , ‘graduate’), major department (Major_code), minor
department (Minor_code) (if any), and degree program (Prog) (‘b.a.’, ‘b.s.’, … , ‘ph.d.’).
Both Ssn and student number have unique values for each student.

b. Each department is described by a name (Dname), department code (Dcode), office


number (Doffice), office phone (Dphone), and college (Dcollege). Both name and code
have unique values for each department.

c. Each course has a course name (Cname), description (Cdesc), course number (Cnum),
number of semester hours (Credit), level (Level), and offering department (Cdept). The
course number is unique for each course.

d. Each section has an instructor (Iname), semester (Semester), year (Year), course
(Sec_course), and section number (Sec_num). The section number distinguishes different
sections of the same course that are taught during the same semester/year; its values are
1, 2, 3, … , up to the total number of sections taught during each semester.

e. A grade record refers to a student (Ssn), a particular section, and a grade (Grade).

Design a relational database schema for this database application. First show all the
functional dependencies that should hold among the attributes. Then design relation
schemas for the database that are each in 3NF or BCNF. Specify the key attributes of each
relation. Note any unspecified requirements, and make appropriate assumptions to render
the specification complete.
This is a classic relational database design question—and a perfect way to apply
everything you've learned about functional dependencies and normalization up to
3NF/BCNF.

Let’s break it down step by step:

✅ STEP 1: Identify Functional Dependencies (FDs)

We'll group FDs based on the given entity types:

a. Student Info

Attributes: Ssn, Snum → all student info (they're both unique IDs)

So:

Ssn → Sname, Snum, Sc_addr, Sc_phone, Sp_addr, Sp_phone,


Bdate, Sex, Class, Major_code, Minor_code, Prog
Snum → same as above

b. Department

Attributes: Dcode, Dname → rest of department info

FDs:

Dcode → Dname, Doffice, Dphone, Dcollege


Dname → Dcode, Doffice, Dphone, Dcollege

c. Course

Attributes: Cnum is unique

FD:

Cnum → Cname, Cdesc, Credit, Level, Cdept

d. Section

Attributes: Section is identified by: (Sec_course, Semester, Year, Sec_num)

FD:

(Sec_course, Semester, Year, Sec_num) → Iname


e. Grade

A grade is identified by:

 (Ssn, Sec_course, Semester, Year, Sec_num)

FD:

(Ssn, Sec_course, Semester, Year, Sec_num) → Grade

✅ STEP 2: Normalize into 3NF/BCNF

1. STUDENT Relation

Since both Ssn and Snum are unique, we can choose one as the primary key (let's use
Ssn).

Schema:

STUDENT(Ssn, Snum, Sname, Sc_addr, Sc_phone, Sp_addr,


Sp_phone, Bdate, Sex, Class, Major_code, Minor_code, Prog)

 Primary Key: Ssn


 Alt Key: Snum
 In BCNF, since the only determinants (Ssn and Snum) are both candidate keys.

2. DEPARTMENT Relation

Schema:

DEPARTMENT(Dcode, Dname, Doffice, Dphone, Dcollege)

 Primary Key: Dcode


 Alt Key: Dname
 In BCNF

3. COURSE Relation

Schema:

COURSE(Cnum, Cname, Cdesc, Credit, Level, Cdept)

 Primary Key: Cnum


 In BCNF
4. SECTION Relation

Schema:

SECTION(Sec_course, Semester, Year, Sec_num, Iname)

 Primary Key: (Sec_course, Semester, Year, Sec_num)


 In BCNF

5. GRADE_RECORD Relation

Combining student + section + grade:

Schema:

GRADE_RECORD(Ssn, Sec_course, Semester, Year, Sec_num, Grade)

 Primary Key: (Ssn, Sec_course, Semester, Year, Sec_num)


 In BCNF

6. MAJOR and MINOR Relations (Optional)

To allow referential integrity between students and departments:

MAJOR(Ssn, Major_code)
MINOR(Ssn, Minor_code)

✅ Assumptions Made

 A student can have at most one major and one minor.


 Each course number (Cnum) is unique globally, not just per department.
 Instructor names (Iname) are not unique — if uniqueness is needed, we would need
an INSTRUCTOR table.

✅ Summary of Relations in 3NF/BCNF

Relation Primary Key(s)


STUDENT Ssn
DEPARTMENT Dcode (alt key: Dname)
COURSE Cnum
SECTION (Sec_course, Semester, Year, Sec_num)
GRADE_RECORD (Ssn, Sec_course, Semester, Year, Sec_num)
MAJOR Ssn
MINOR Ssn
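
As a rough illustration, part of this schema can be declared with Python's built-in `sqlite3` module. The column types, the in-memory database, the omitted attributes, and all sample values are assumptions made for this sketch; major and minor are kept as foreign-key columns of STUDENT rather than as the optional MAJOR and MINOR relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Dcode    TEXT PRIMARY KEY,
    Dname    TEXT NOT NULL UNIQUE,   -- alternate key
    Doffice  TEXT, Dphone TEXT, Dcollege TEXT
);
CREATE TABLE STUDENT (
    Ssn        TEXT PRIMARY KEY,
    Snum       TEXT NOT NULL UNIQUE, -- alternate key
    Sname      TEXT,
    Major_code TEXT REFERENCES DEPARTMENT(Dcode),
    Minor_code TEXT REFERENCES DEPARTMENT(Dcode)  -- NULL when there is no minor
    -- remaining STUDENT attributes omitted for brevity
);
CREATE TABLE COURSE (
    Cnum  TEXT PRIMARY KEY,
    Cname TEXT,
    Cdept TEXT REFERENCES DEPARTMENT(Dcode)
    -- Cdesc, Credit, Level omitted for brevity
);
CREATE TABLE SECTION (
    Sec_course TEXT REFERENCES COURSE(Cnum),
    Semester   TEXT, Year INTEGER, Sec_num INTEGER,
    Iname      TEXT,
    PRIMARY KEY (Sec_course, Semester, Year, Sec_num)
);
CREATE TABLE GRADE_RECORD (
    Ssn        TEXT REFERENCES STUDENT(Ssn),
    Sec_course TEXT, Semester TEXT, Year INTEGER, Sec_num INTEGER,
    Grade      TEXT,
    PRIMARY KEY (Ssn, Sec_course, Semester, Year, Sec_num),
    FOREIGN KEY (Sec_course, Semester, Year, Sec_num)
        REFERENCES SECTION (Sec_course, Semester, Year, Sec_num)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO DEPARTMENT VALUES ('CS', 'Computer Science', NULL, NULL, NULL)")
conn.execute("INSERT INTO STUDENT VALUES ('123-45-6789', 'S001', 'Alice', 'CS', NULL)")
conn.execute("INSERT INTO COURSE VALUES ('CS101', 'Intro to CS', 'CS')")
conn.execute("INSERT INTO SECTION VALUES ('CS101', 'Fall', 2024, 1, 'Dr. Smith')")
conn.execute("INSERT INTO GRADE_RECORD VALUES ('123-45-6789', 'CS101', 'Fall', 2024, 1, 'A')")

# A grade for a student who does not exist should be refused by the foreign key.
try:
    conn.execute("INSERT INTO GRADE_RECORD VALUES ('000-00-0000', 'CS101', 'Fall', 2024, 1, 'B')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("orphan grade rejected:", rejected)
```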

14.20. What update anomalies occur in the EMP_PROJ and EMP_DEPT relations of Figures
14.3 and 14.4?

To answer Review Question 14.20, we need to examine the update anomalies in the EMP_PROJ
and EMP_DEPT relations from Figures 14.3 and 14.4 of Fundamentals of Database Systems
(7th Edition). Since I can’t directly show those figures, I’ll summarize their structure
and the typical anomalies they illustrate.

📘 Figure 14.3 – EMP_PROJ Relation


This relation likely includes:
Emp_ID, Emp_Name, Job, Proj_ID, Proj_Name, Hours
❗ Update Anomalies in EMP_PROJ:
1. Insertion Anomaly:
o If a new employee is hired but not yet assigned to a project, you can't insert
the employee's data unless you leave Proj_ID, Proj_Name, and Hours NULL or
insert dummy data.
2. Deletion Anomaly:
o If an employee is removed from their last project (e.g., you delete their row),
all their employee data might be lost—even though they still exist in the
company.
3. Modification Anomaly:
o If the same employee works on multiple projects, and you update their Job
title in one row but forget in others, inconsistent data results.

📘 Figure 14.4 – EMP_DEPT Relation


This relation likely includes:
Emp_ID, Emp_Name, Job, Dept_ID, Dept_Name, Dept_Location
❗ Update Anomalies in EMP_DEPT:
1. Insertion Anomaly:
o Can’t add a new department unless at least one employee works there, or
else you insert dummy employee info.
2. Deletion Anomaly:
o If the only employee in a department leaves, deleting their row deletes all
info about the department too.
3. Modification Anomaly:
o If multiple employees are in the same department and the department’s
location changes, you have to update all rows—risking inconsistency.

✅ Summary
Both EMP_PROJ and EMP_DEPT violate normalization (probably stuck in 1NF or 2NF),
which leads to:
 Redundant data
 Inconsistent updates
 Difficulty managing partial info
Perfect! Let’s normalize both EMP_PROJ and EMP_DEPT relations
into 3NF (or BCNF) to eliminate the update anomalies we just
discussed.
📌 1. EMP_PROJ – Before Normalization
Original Relation:
EMP_PROJ(Emp_ID, Emp_Name, Job, Proj_ID, Proj_Name, Hours)
Functional Dependencies:
 Emp_ID → Emp_Name, Job
 Proj_ID → Proj_Name
 (Emp_ID, Proj_ID) → Hours

✅ After Normalization:
a) EMPLOYEE
EMPLOYEE(Emp_ID, Emp_Name, Job)
b) PROJECT
PROJECT(Proj_ID, Proj_Name)
c) WORKS_ON
WORKS_ON(Emp_ID, Proj_ID, Hours)
 Composite Primary Key: (Emp_ID, Proj_ID)
🔧 Benefits:
 No redundancy: employee or project info isn’t duplicated.
 No anomalies: you can add/remove employees/projects without dummy data.

📌 2. EMP_DEPT – Before Normalization


Original Relation:
EMP_DEPT(Emp_ID, Emp_Name, Job, Dept_ID, Dept_Name, Dept_Location)
Functional Dependencies:
 Emp_ID → Emp_Name, Job, Dept_ID
 Dept_ID → Dept_Name, Dept_Location

✅ After Normalization:
a) EMPLOYEE
EMPLOYEE(Emp_ID, Emp_Name, Job, Dept_ID)
b) DEPARTMENT
DEPARTMENT(Dept_ID, Dept_Name, Dept_Location)
 Dept_ID is a foreign key in EMPLOYEE.
🔧 Benefits:
 Can add departments without employees.
 Can delete employees without losing department info.
 Any changes to department name/location only updated once.

🎯 Final Result – Fully Normalized (3NF/BCNF)


| Table | Primary Key | Notes |
| EMPLOYEE | Emp_ID | Includes foreign key to DEPARTMENT |
| PROJECT | Proj_ID | |
| WORKS_ON | (Emp_ID, Proj_ID) | With Hours |
| DEPARTMENT | Dept_ID | |

14.21. In what normal form is the LOTS relation schema in Figure 14.12(a) with respect to
the restrictive interpretations of normal form that take only the primary key into account?
Would it be in the same normal form if the general definitions of normal form were used?

Great question! Let's break it down by referring to Figure 14.12(a) from Fundamentals of
Database Systems (7th Edition), which shows the LOTS relation:

📘 LOTS Relation Schema (Figure 14.12a)


LOTS(Property_ID, County_Name, Lot_Number, Area, Price, Tax_Rate)
Assume:
 Primary key: Property_ID
 A second candidate key: {County_Name, Lot_Number}, since lot numbers are unique only
within a county

🔹 Functional Dependencies (FDs):


Let’s assume the following FDs based on the semantics:
1. Property_ID → County_Name, Lot_Number, Area, Price, Tax_Rate
2. {County_Name, Lot_Number} → Property_ID, Area, Price, Tax_Rate
3. County_Name → Tax_Rate (since each county has a fixed tax rate)

🔍 Interpretation 1: Restrictive (Primary Key-Based) Normal Forms


✅ 1NF: Yes, assuming all attributes are atomic.
✅ 2NF: Yes, because the primary key Property_ID is a single attribute, so no non-key
attribute can depend on only part of it.
❌ 3NF: No. Tax_Rate is transitively dependent on the primary key through County_Name
(Property_ID → County_Name → Tax_Rate), and County_Name is not a key.
✅ So under the restrictive (primary-key-only) interpretation, LOTS is in 2NF but not in 3NF.

🔍 Interpretation 2: General (All Keys-Based) Normal Forms


Problem:
The general definitions test every candidate key, including {County_Name, Lot_Number}:
County_Name → Tax_Rate
Here County_Name is a proper subset of a candidate key and Tax_Rate is a non-prime
attribute, so Tax_Rate is only partially dependent on {County_Name, Lot_Number}.
❌ Therefore:
 LOTS is NOT in 2NF under the general definition; it is only in 1NF.
 It is consequently not in 3NF or BCNF either, because County_Name is not a superkey.

✅ Fix (to bring it to 3NF/BCNF under general rules):


Split LOTS into two relations:
a) LOTS: LOTS(Property_ID, County_Name, Lot_Number, Area, Price)
b) COUNTY_TAX: COUNTY_TAX(County_Name, Tax_Rate)

🧠 Summary:
| Interpretation | Highest normal form of LOTS | Why? |
| Restrictive (primary key only) | 2NF | Single-attribute primary key, but Tax_Rate depends on it transitively through County_Name |
| General (all candidate keys) | 1NF | County_Name → Tax_Rate is a partial dependency on the candidate key {County_Name, Lot_Number} |
14.24. Consider the universal relation R = {A, B, C, D, E, F, G, H, I, J} and the set of
functional dependencies F = {{A, B}→{C}, {A}→{D, E}, {B}→{F}, {F}→{G, H}, {D}→{I, J}}. What
is the key for R? Decompose R into 2NF and then 3NF relations.

Let's solve Question 14.24 step-by-step:

🧩 Given:
 Universal Relation R = {A, B, C, D, E, F, G, H, I, J}
 Functional Dependencies F:
1. {A, B} → {C}
2. A → {D, E}
3. B → {F}
4. F → {G, H}
5. D → {I, J}
✅ Step 1: Find the Candidate Key
We compute the closure of {A, B} (since it appears on the LHS of multiple FDs):
Start with:
(A, B)+ = {A, B}
1. {A, B} → C ⇒ add C
→ {A, B, C}
2. A → D, E ⇒ add D, E
→ {A, B, C, D, E}
3. B → F ⇒ add F
→ {A, B, C, D, E, F}
4. F → G, H ⇒ add G, H
→ {A, B, C, D, E, F, G, H}
5. D → I, J ⇒ add I, J
→ {A, B, C, D, E, F, G, H, I, J} = R
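✅ So, {A, B} is a superkey. It is also minimal, because A⁺ = {A, D, E, I, J} and
B⁺ = {B, F, G, H} do not cover R, so {A, B} is a candidate key.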
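
The closure computation above can be sketched in Python. The function name `closure` and the encoding of each FD as a pair of strings (one character per attribute) are illustrative choices, not the textbook's notation.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# F from Exercise 14.24; each string is a set of single-letter attributes.
F = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]

print(sorted(closure("AB", F)))  # all ten attributes A..J
print(sorted(closure("A", F)))   # ['A', 'D', 'E', 'I', 'J']
print(sorted(closure("B", F)))   # ['B', 'F', 'G', 'H']
```

Because neither A nor B alone reaches all of R, {A, B} is minimal.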

🔁 Step 2: Decompose into 2NF


✅ Check for Partial Dependencies:
2NF removes partial dependencies (where a non-prime attribute depends on part of a
candidate key).
Candidate Key: (A, B)
Check if any attribute is dependent on A only or B only:
 A → D, E ⇒ Partial dependency ❌
 B → F ⇒ Partial dependency ❌
 F → G, H ⇒ Not partial, F isn’t part of key
 D → I, J ⇒ Not partial, D isn’t part of key

✅ 2NF Decomposition:
Move each partially dependent attribute, together with everything it determines, into a
relation keyed by the part of the key it depends on:
1. R1(A, B, C) – (Core relation, full FD: {A, B} → C)
2. R2(A, D, E, I, J) – (From A → D, E and D → I, J)
3. R3(B, F, G, H) – (From B → F and F → G, H)
These relations are in 2NF, but R2 and R3 still contain transitive dependencies.

🔁 Step 3: Decompose into 3NF


3NF removes transitive dependencies (non-prime attribute depending on another non-prime
attribute).
 In R2: A → D and D → I, J, so I and J depend on the key A only through D. Split off
R5(D, I, J), leaving R2(A, D, E).
 In R3: B → F and F → G, H, so G and H depend on the key B only through F. Split off
R4(F, G, H), leaving R3(B, F).
| Relation | FD (key → non-prime) | 3NF? |
| R1(A, B, C) | {A, B} → C | ✅ |
| R2(A, D, E) | A → D, E | ✅ |
| R3(B, F) | B → F | ✅ |
| R4(F, G, H) | F → G, H | ✅ |
| R5(D, I, J) | D → I, J | ✅ |
✅ All relations are in 3NF!

✅ Final 3NF Decomposition:


Relation Attributes Primary Key
R1 A, B, C A, B
R2 A, D, E A
R3 B, F B
R4 F, G, H F
R5 D, I, J D
14.25. Repeat Exercise 14.24 for the following different set of functional dependencies G =
{{A, B}→{C}, {B, D}→{E, F}, {A, D}→{G, H}, {A}→{I}, {H}→{J}}.

Let's solve Exercise 14.25 step by step.

🧩 Given:
 Universal Relation R = {A, B, C, D, E, F, G, H, I, J}
 Functional Dependencies G:
1. {A, B} → {C}
2. {B, D} → {E, F}
3. {A, D} → {G, H}
4. A → I
5. H → J

✅ Step 1: Find a Candidate Key


We test closure of combinations of attributes to find a key.
Try {A, B, D}:
Start with:
(A, B, D)+ = {A, B, D}
Apply FDs:
1. {A, B} → C ⇒ add C
2. {B, D} → E, F ⇒ add E, F
3. {A, D} → G, H ⇒ add G, H
4. A → I ⇒ add I
5. H → J ⇒ add J
So: (A, B, D)+ = {A, B, C, D, E, F, G, H, I, J} = R
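✅ So, {A, B, D} is a superkey. None of A, B, or D appears on the right-hand side of any
FD, so every key must contain all three; {A, B, D} is therefore the only candidate key.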
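
For a small schema like this one, the candidate keys can also be found by brute force. This is a sketch that is only practical for a handful of attributes; `candidate_keys` is a hypothetical helper that tries attribute sets in order of size and skips supersets of keys already found.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Return every minimal attribute set whose closure is all_attrs."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # a superset of a known key is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


# G from Exercise 14.25; each string is a set of single-letter attributes.
G = [("AB", "C"), ("BD", "EF"), ("AD", "GH"), ("A", "I"), ("H", "J")]

print(candidate_keys("ABCDEFGHIJ", G))  # [('A', 'B', 'D')]
```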
🔁 Step 2: Decompose into 2NF
Check for partial dependencies:
A candidate key is {A, B, D}.
Any dependency on a subset (e.g., A, B, or D) is a partial dependency.
From the FDs:
 A → I ❌ (partial)
 {A, B} → C ❌ (partial)
 {B, D} → E, F ❌ (partial)
 {A, D} → G, H ❌ (partial)
 H → J ✅ (not partial, H isn’t part of key)

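✅ 2NF Decomposition:
Give each partial dependency its own relation, and keep a relation on the key itself so
that the original relation can be rebuilt by joins:
1. R1(A, B, C) – {A, B} → C
2. R2(B, D, E, F) – {B, D} → E, F
3. R3(A, D, G, H, J) – {A, D} → G, H and H → J
4. R4(A, I) – A → I
5. R6(A, B, D) – the key of R (every attribute is prime)
R3 is in 2NF but still contains the transitive dependency {A, D} → H → J.

🔁 Step 3: Decompose into 3NF


3NF removes transitive dependencies (non-prime attribute → non-prime attribute).
Split R3 into R3(A, D, G, H) and R5(H, J).
| Relation | FD | 3NF? |
| R1 | {A, B} → C | ✅ |
| R2 | {B, D} → E, F | ✅ |
| R3 | {A, D} → G, H | ✅ |
| R4 | A → I | ✅ |
| R5 | H → J | ✅ |
| R6 | none (all-key) | ✅ |
✅ All relations satisfy 3NF.

✅ Final 3NF Decomposition:


| Relation | Attributes | Primary Key |
| R1 | A, B, C | A, B |
| R2 | B, D, E, F | B, D |
| R3 | A, D, G, H | A, D |
| R4 | A, I | A |
| R5 | H, J | H |
| R6 | A, B, D | A, B, D |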
14.26. Consider the following relation:

a. Given the previous extension (state), which of the following dependencies may hold in
the above relation? If the dependency cannot hold, explain why by specifying the tuples
that cause the violation. i. A → B, ii. B → C, iii. C → B, iv. B → A, v. C → A
b. Does the above relation have a potential candidate key? If it does, what is it? If it does
not, why not?
Let's solve Exercise 14.26 step by step.

🧩 Given Relation:
TUP# A B C
1 10 b1 c1
2 10 b2 c1
3 11 b4 b3
4 12 c1 b1
5 13 b3 c4
6 14 b3 c4

🔍 (a) Check if these dependencies hold


i. A → B
Check for same A → same B
 TUP#1 and TUP#2: A = 10 → B = b1, b2 ❌ Violation
 ✅ Does NOT hold (violated by tuples 1 & 2)

ii. B → C
Same B should imply same C
 TUP#5 and TUP#6: B = b3 → C = c4 ✅
 TUP#1: B = b1 → C = c1
 TUP#4: B = c1 → C = b1
 All others have distinct B
✅ Holds (no counterexamples found)

iii. C → B
Same C → same B
 TUP#1 and TUP#2: C = c1 → B = b1, b2 ❌ Violation
 ✅ Does NOT hold (violated by tuples 1 & 2)

iv. B → A
Same B → same A
 TUP#5: B = b3 → A = 13
 TUP#6: B = b3 → A = 14 ❌ Violation
✅ Does NOT hold (violated by tuples 5 & 6)

v. C → A
 TUP#5 and TUP#6: C = c4 → A = 13, 14 ❌ Violation
✅ Does NOT hold (violated by tuples 5 & 6)

✅ Summary of (a):
Dependency Holds? Violation Tuples
A→B ❌ TUP#1 and TUP#2
B→C ✅ None
C→B ❌ TUP#1 and TUP#2
B→A ❌ TUP#5 and TUP#6
C→A ❌ TUP#5 and TUP#6
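
The same check can be run mechanically on the state tabulated above. This is a minimal sketch; `violations` is a hypothetical helper that lists every pair of tuple numbers that agree on the left-hand attribute but differ on the right-hand one.

```python
from itertools import combinations

# The relation state as tabulated above, keyed by TUP#.
state = {
    1: {"A": 10, "B": "b1", "C": "c1"},
    2: {"A": 10, "B": "b2", "C": "c1"},
    3: {"A": 11, "B": "b4", "C": "b3"},
    4: {"A": 12, "B": "c1", "C": "b1"},
    5: {"A": 13, "B": "b3", "C": "c4"},
    6: {"A": 14, "B": "b3", "C": "c4"},
}


def violations(state, x, y):
    """Pairs of tuple numbers that agree on x but disagree on y."""
    return [
        (i, j)
        for (i, r), (j, s) in combinations(state.items(), 2)
        if r[x] == s[x] and r[y] != s[y]
    ]


for x, y in [("A", "B"), ("B", "C"), ("C", "B"), ("B", "A"), ("C", "A")]:
    print(f"{x} -> {y}:", violations(state, x, y) or "no violation in this state")
```

An empty list means only that this state does not refute the FD, which is why the exercise asks which dependencies "may hold".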

🔍 (b) Candidate Key?


Let’s check if any attribute or combination uniquely identifies each tuple.
Try single attributes:
 A: 10 appears twice ❌
 B: b3 appears twice ❌
 C: c1 and c4 repeat ❌
Try pairs:
 (A, B)?
o (10, b1), (10, b2), (11, b4), (12, c1), (13, b3), (14, b3) – All unique ✅
 (B, C)?
o (b3, c4) appears twice (TUP#5 and #6) ❌
 (A, C)?
o (10, c1) appears twice (TUP#1 and #2) ❌
So:
The only pair that is unique in this state is (A, B), and neither A nor B is unique on
its own, so (A, B) is minimal.
Potential candidate key:
 (A, B)

✅ Final Answers:
a. Which dependencies may hold?
 ✅ Only B → C may hold (a single state cannot prove it, only fail to refute it)
 ❌ All others are violated by the tuples listed above.
b. Does it have a potential candidate key?
 ✅ Yes: (A, B). It is only a potential key, because uniqueness in one state does not
guarantee uniqueness in every state.
14.27. Consider a relation R(A, B, C, D, E) with the following dependencies: AB → C,
CD → E, DE → B. Is AB a candidate key of this relation? If not, is ABD? Explain your
answer.

Let's analyze Exercise 14.27 step by step.

🔧 Given:
Relation: R(A, B, C, D, E)
Functional Dependencies:
1. AB → C
2. CD → E
3. DE → B
We are asked whether:
 AB is a candidate key, and if not,
 Whether ABD is a candidate key.

✅ Step 1: Compute closure of AB (AB⁺)


Start with:
AB⁺ = {A, B}
Using AB → C, we get:
→ Add C → AB⁺ = {A, B, C}
Now we have C, so try CD → E
→ We don’t have D, so can’t use it yet ❌
We can't apply DE → B either (need both D and E).
So final AB⁺ = {A, B, C}
Conclusion: ❌ AB is not a candidate key (does not cover all attributes)

✅ Step 2: Try ABD⁺


Start with:
ABD⁺ = {A, B, D}
Use AB → C → Add C
→ ABD⁺ = {A, B, C, D}
Now CD → E → We have C and D → Add E
→ ABD⁺ = {A, B, C, D, E}
Now have all attributes. ✅
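Conclusion: ✅ ABD is a superkey (closure includes all attributes). It is also minimal:
AB⁺ = {A, B, C}, AD⁺ = {A, D}, and BD⁺ = {B, D}, so no proper subset is a superkey, and
ABD is a candidate key.

✅ Final Answer:
 AB is not a candidate key, because AB⁺ = {A, B, C}
 ABD is a candidate key, because ABD⁺ = {A, B, C, D, E} and no proper subset of ABD
has that closure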
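
A candidate key must be both a superkey and minimal. The sketch below tests both conditions; `is_candidate_key` is a hypothetical helper name. Removing one attribute at a time is enough to test minimality, because any smaller subset has a closure no larger than that of some one-attribute-removed subset.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_candidate_key(attrs, all_attrs, fds):
    """True if attrs determines every attribute and no attribute can be dropped."""
    attrs, all_attrs = set(attrs), set(all_attrs)
    if closure(attrs, fds) != all_attrs:
        return False  # not even a superkey
    return all(closure(attrs - {a}, fds) != all_attrs for a in attrs)


# FDs from Exercise 14.27; each string is a set of single-letter attributes.
fds = [("AB", "C"), ("CD", "E"), ("DE", "B")]

print(is_candidate_key("AB", "ABCDE", fds))   # False: AB+ = {A, B, C}
print(is_candidate_key("ABD", "ABCDE", fds))  # True
```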
14.28. Consider the relation R, which has attributes that hold schedules of courses
and sections at a university; R = {Course_no, Sec_no, Offering_dept, Credit_hours,
Course_level, Instructor_ssn, Semester, Year, Days_hours, Room_no,
No_of_students}. Suppose that the following functional dependencies hold on R:
{Course_no} → {Offering_dept, Credit_hours, Course_level} {Course_no, Sec_no,
Semester, Year} → {Days_hours, Room_no, No_of_students, Instructor_ssn}
{Room_no, Days_hours, Semester, Year} → {Instructor_ssn, Course_no, Sec_no} Try
to determine which sets of attributes form keys of R. How would you normalize this
relation?
Let's walk through Exercise 14.28 step by step to determine:
1. 🔑 Candidate keys of relation R
2. 🧼 How to normalize R (up to 3NF or BCNF)

🧩 Given:
Relation R =
{Course_no, Sec_no, Offering_dept, Credit_hours, Course_level, Instructor_ssn,
Semester, Year, Days_hours, Room_no, No_of_students}
Functional Dependencies (FDs):
1. FD1:
Course_no → Offering_dept, Credit_hours, Course_level
2. FD2:
Course_no, Sec_no, Semester, Year → Days_hours, Room_no,
No_of_students, Instructor_ssn
3. FD3:
Room_no, Days_hours, Semester, Year → Instructor_ssn, Course_no, Sec_no

🔍 Step 1: Analyze Candidate Keys


Let’s try finding a minimal set of attributes whose closure gives us all attributes of R.

Try: (Course_no, Sec_no, Semester, Year)


(This appears in FD2’s LHS, so it’s a good start)
Closure:
Start with {Course_no, Sec_no, Semester, Year}
Using FD2:
→ Add {Days_hours, Room_no, No_of_students, Instructor_ssn}
Now we have:
 {Course_no, Sec_no, Semester, Year, Days_hours, Room_no, No_of_students,
Instructor_ssn}
Use FD1:
→ From Course_no, add: {Offering_dept, Credit_hours, Course_level}
Use FD3:
We already have Room_no, Days_hours, Semester, Year
→ Add: {Instructor_ssn (already present), Course_no (already present), Sec_no
(already present)}
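So now the closure is:
{Course_no, Sec_no, Semester, Year}⁺ = All attributes in R ✅
➡️ Candidate Key = {Course_no, Sec_no, Semester, Year}
By the same reasoning, {Room_no, Days_hours, Semester, Year} is a second candidate key:
FD3 gives Instructor_ssn, Course_no, and Sec_no, after which FD2 and FD1 give the
remaining attributes.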

🔍 Step 2: Normalize to 3NF / BCNF


Let’s now decompose the relation R into smaller relations based on the functional
dependencies, preserving dependency and being in BCNF (or at least 3NF).

✂️Decomposition based on FD1:


FD1:
Course_no → Offering_dept, Credit_hours, Course_level
This suggests we create a relation:
R1 (Course):
Course_no (PK), Offering_dept, Credit_hours, Course_level

✂️Decomposition based on FD2:


FD2:
Course_no, Sec_no, Semester, Year → Days_hours, Room_no, No_of_students,
Instructor_ssn
Create:
R2 (Section):
(Course_no, Sec_no, Semester, Year) (PK), Days_hours, Room_no, No_of_students,
Instructor_ssn

✂️FD3 introduces dependency on Room & Time:


FD3:
Room_no, Days_hours, Semester, Year → Instructor_ssn, Course_no, Sec_no
Every attribute of FD3 already appears in R2, and the LHS of FD3 determines all attributes of R2, so it is a second candidate key of R2. FD3 is therefore preserved inside R2. A separate relation
R3 (Room_Schedule):
(Room_no, Days_hours, Semester, Year) (PK), Instructor_ssn, Course_no, Sec_no
would be a projection of R2 and would only duplicate its data. It is enough to declare (Room_no, Days_hours, Semester, Year) as an alternate key of R2.

✅ Final Decomposed Schema (BCNF):


R1: Course
 Attributes: Course_no (PK), Offering_dept, Credit_hours, Course_level
 From FD1
R2: Section
 Attributes: Course_no, Sec_no, Semester, Year (composite PK), Days_hours,
Room_no, No_of_students, Instructor_ssn
 Alternate key: (Room_no, Days_hours, Semester, Year)
 From FD2 and FD3

✨ Summary
 Candidate keys of R: {Course_no, Sec_no, Semester, Year} and {Room_no, Days_hours, Semester, Year}
 Decomposed Relations (in BCNF):
o R1: Course info
o R2: Section offering info, including the room schedule (FD3)
14.29. Consider the following relations for an order-processing application database
at ABC, Inc. ORDER (O#, Odate, Cust#, Total_amount) ORDER_ITEM(O#, I#,
Qty_ordered, Total_price, Discount%) Assume that each item has a different
discount. The Total_price refers to one item, Odate is the date on which the order
was placed, and the Total_amount is the amount of the order. If we apply a natural
join on the relations ORDER_ITEM and ORDER in this database, what does the
resulting relation schema RES look like? What will be its key? Show the FDs in this
resulting relation. Is RES in 2NF? Is it in 3NF? Why or why not? (State assumptions, if
you make any.)

Let's walk through Exercise 14.29 step by step.

🧾 Given: Two base relations:

ORDER

(O#, Odate, Cust#, Total_amount)

 O# = Order Number (primary key)

ORDER_ITEM

(O#, I#, Qty_ordered, Total_price, Discount%)

 Composite primary key: (O#, I#)

🧩 Step 1: Natural Join → RES

Natural Join on O#, the common attribute:

RES (Resulting Schema):

RES = (O#, Odate, Cust#, Total_amount, I#, Qty_ordered, Total_price, Discount%)

🔑 Step 2: What is the Primary Key of RES?


Since:

 ORDER_ITEM has a composite PK: (O#, I#)


 ORDER has a PK: O#
 After join, each item in an order is preserved, and I# distinguishes them within an
order

➡️Key of RES = (O#, I#) ✅


(We still need both to identify a unique tuple.)

📐 Step 3: Functional Dependencies (FDs)

We infer the following FDs:

1. From ORDER:
O# → Odate, Cust#, Total_amount ✅
2. From ORDER_ITEM:
O#, I# → Qty_ordered, Total_price, Discount% ✅

🧼 Step 4: Is RES in 2NF?

2NF: No partial dependency of non-prime attributes on a proper subset of a candidate key.

 Candidate key: (O#, I#)


 Check if any non-key attribute is dependent only on part of the key:

From FD1:

 O# → Odate, Cust#, Total_amount


✅ These are partial dependencies (only on O#)

🚨 Violation of 2NF
So, RES is not in 2NF
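The 2NF test can be mechanized as a rough sketch: compute the closure of every proper subset of the key and report any attributes outside the key that it determines. The helper names `closure` and `partial_dependencies` are assumptions for this illustration, and the sketch treats every attribute outside the key as non-prime, which is valid only when the relation has a single candidate key, as RES does.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, fds):
    """Map each proper subset of key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for combo in combinations(sorted(key), size):
            extra = closure(set(combo), fds) - key
            if extra:
                found[combo] = extra
    return found

FDS = [
    ({"O#"}, {"Odate", "Cust#", "Total_amount"}),
    ({"O#", "I#"}, {"Qty_ordered", "Total_price", "Discount%"}),
]
KEY = {"O#", "I#"}

violations = partial_dependencies(KEY, FDS)
print(violations)
```

The only entry reported is for the subset (O#), which determines Odate, Cust#, and Total_amount; I# alone determines nothing.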

🧼 Step 5: Is RES in 3NF?

We already failed 2NF ⇒ cannot be in 3NF.

But let's double-check:

 3NF: For every FD X → A, at least one of the following holds:

1. A ∈ X (trivial)
2. X is a superkey
3. A is a prime attribute (part of some candidate key)

FD1: O# → Odate, Cust#, Total_amount

 LHS is not a superkey
 RHS attributes are not prime

🚨 Violation ⇒ Not in 3NF

🧠 Normalization Suggestion:

Decompose RES into 2NF / 3NF:

1. Order_Info
(O#, Odate, Cust#, Total_amount)
PK: O#
2. Order_Item
(O#, I#, Qty_ordered, Total_price, Discount%)
PK: (O#, I#)

✅ Now both are in 3NF (and even BCNF)

✅ Final Summary:

 RES schema = (O#, Odate, Cust#, Total_amount, I#, Qty_ordered, Total_price, Discount%)
 Primary Key = (O#, I#)
 FDs:
o O# → Odate, Cust#, Total_amount
o O#, I# → Qty_ordered, Total_price, Discount%
 Not in 2NF or 3NF due to partial dependencies
 Can be normalized into 3NF using base relations ORDER and ORDER_ITEM

14.30. Consider the following relation: CAR_SALE(Car#, Date_sold, Salesperson#,


Commission%, Discount_amt) Assume that a car may be sold by multiple salespeople,
and hence {Car#, Salesperson#} is the primary key. Additional dependencies are
Date_sold → Discount_amt and Salesperson# → Commission% Based on the given
primary key, is this relation in 1NF, 2NF, or 3NF? Why or why not? How would you
successively normalize it completely?

Great! Let’s break down Exercise 14.30 step by step:

🔧 Given Relation:

CAR_SALE(Car#, Date_sold, Salesperson#, Commission%, Discount_amt)

 Primary Key = {Car#, Salesperson#}


(because multiple salespeople can sell the same car)

📐 Step 1: Is it in 1NF?

1NF: All attribute values must be atomic (no multivalued or nested attributes).

✅ Yes, the table appears to be in 1NF. All values are atomic.

📐 Step 2: Is it in 2NF?

2NF: Must be in 1NF + no partial dependency (i.e., no non-prime attribute depends on


part of a candidate key).

Let’s analyze:

 Primary Key = (Car#, Salesperson#)


 FDs:
o Date_sold → Discount_amt
o Salesperson# → Commission%

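Check for partial dependencies:

 Salesperson# → Commission%
✅ A partial dependency: Commission% is non-prime and depends on just Salesperson#, which is only part of the key.

 Date_sold → Discount_amt
Date_sold is not part of the key, so this FD is not a partial dependency by itself. This solution additionally assumes Car# → Date_sold (each car has a single sale date, whoever the salespeople are); the exercise does not state this FD.

⇒ Under that assumption, Car# → Date_sold → Discount_amt, so Date_sold and Discount_amt depend on just Car#, which is again only part of the key.

🚨 Violates 2NF
📐 Step 3: Is it in 3NF?

No: a relation that is not in 2NF cannot be in 3NF.

There is also a transitive dependency through the non-prime attribute Date_sold:
Car# → Date_sold → Discount_amt
So even after fixing 2NF, this dependency must still be removed to reach 3NF.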

🧼 Step 4: Normalize to 2NF & 3NF (Full Normalization)

Step 1: Decompose to eliminate partial dependencies (2NF)

We create three relations:

(1) CAR_SALE

 Attributes: Car#, Salesperson#
 Primary Key: (Car#, Salesperson#)
 Contains only the core relationship (who sold which car)

(2) SALE_INFO

 Attributes: Car#, Date_sold, Discount_amt
 Primary Key: Car#
 Relies on the assumption that Car# determines Date_sold; still contains Car# → Date_sold → Discount_amt

(3) SALESPERSON_INFO

 Attributes: Salesperson#, Commission%
 Primary Key: Salesperson#
 Since each salesperson has a fixed commission%

All three relations are in 2NF, but SALE_INFO is not yet in 3NF: Discount_amt depends on Date_sold, which is neither a key of SALE_INFO nor a prime attribute.

Step 2: Decompose to eliminate the transitive dependency (3NF)

Split SALE_INFO into two relations:

(2a) SALE_INFO

 Attributes: Car#, Date_sold
 Primary Key: Car#

(2b) DISCOUNT_INFO

 Attributes: Date_sold, Discount_amt
 Primary Key: Date_sold

✅ Now:

 All relations are in 3NF
 All FDs are preserved
 No partial or transitive dependencies

✅ Final Answer Summary:

 Original CAR_SALE relation is in 1NF only
 ❌ Not in 2NF: due to partial dependencies on Car# and Salesperson#
 ❌ Not in 3NF: due to transitive dependency via Date_sold
 ✅ Normalize into 3NF by decomposing into:

Final 3NF Relations:

Relation Name      Attributes                   Primary Key
CAR_SALE           Car#, Salesperson#           (Car#, Salesperson#)
SALE_INFO          Car#, Date_sold              Car#
DISCOUNT_INFO      Date_sold, Discount_amt      Date_sold
SALESPERSON_INFO   Salesperson#, Commission%    Salesperson#

14.31. Consider the following relation for published books: BOOK (Book_title,
Author_name, Book_type, List_price, Author_affil, Publisher)
Author_affil refers to the affiliation of the author. Suppose the following dependencies exist:
Book_title → Publisher, Book_type
Book_type → List_price
Author_name → Author_affil
a. What normal form is the relation in? Explain your answer.
b. Apply normalization until you cannot decompose the relations further. State the
reasons behind each decomposition.
Let's work through Exercise 14.31 step by step:

Given Relation:
BOOK(Book_title, Author_name, Book_type, List_price, Author_affil, Publisher)

Given Functional Dependencies (FDs):


1. Book_title → Publisher, Book_type
2. Book_type → List_price
3. Author_name → Author_affil

Step (a): What normal form is this relation in?


🔹 Check for 1NF:
 All attribute values are atomic
✅ So the relation is in 1NF

🔹 Check for 2NF:


We need to know the Primary Key.
Assumption: Since Book_title and Author_name together identify each row (i.e., a specific
book written by a specific author), the Primary Key is: {Book_title, Author_name}
Now, let’s examine for partial dependencies:

1. Book_title → Publisher, Book_type
⇒ Book_title is only a part of the primary key
⇒ 🚨 Partial dependency

2. Author_name → Author_affil
⇒ Also a partial dependency (depends on only Author_name)

So ❌ the relation is not in 2NF.

✅ Answer to Part (a):


The relation is in 1NF but violates 2NF due to partial dependencies on Book_title and
Author_name.

Step (b): Normalize into 3NF


We'll decompose the relation step-by-step:

🔹 Step 1: Remove Partial Dependencies (2NF)


Split into multiple relations:
(1) BOOK_AUTHORS
 Attributes: Book_title, Author_name
 Primary Key: (Book_title, Author_name)
 Represents who wrote which book
(2) BOOK_INFO
 Attributes: Book_title, Publisher, Book_type, List_price
 Primary Key: Book_title
 From FD: Book_title → Publisher, Book_type. List_price stays here for now because Book_title → Book_type → List_price
(3) AUTHOR_INFO
 Attributes: Author_name, Author_affil
 Primary Key: Author_name
 From FD: Author_name → Author_affil
Now check for any transitive dependencies.

🔹 Step 2: Remove Transitive Dependencies (3NF)


From Book_type → List_price in BOOK_INFO, we can split further:
(4) BOOK_TYPE_INFO
 Attributes: Book_type, List_price
 Primary Key: Book_type
Remove List_price from BOOK_INFO
So now we have:

✅ Final 3NF Relations:


Relation Name    Attributes                          Primary Key
BOOK_AUTHORS     Book_title, Author_name             (Book_title, Author_name)
BOOK_INFO        Book_title, Publisher, Book_type    Book_title
AUTHOR_INFO      Author_name, Author_affil           Author_name
BOOK_TYPE_INFO   Book_type, List_price               Book_type

✅ Final Answer Summary:


a. The original relation is in 1NF only, not 2NF due to partial dependencies on Book_title
and Author_name.
b. We decomposed the relation into four 3NF relations by removing the partial and transitive
dependencies. Each decomposition step keeps the determining attribute in both relations, so no data is lost and every given FD is preserved.
14.32. This exercise asks you to convert business statements into dependencies.
Consider the relation DISK_DRIVE (Serial_number, Manufacturer, Model, Batch,
Capacity, Retailer). Each tuple in the relation DISK_DRIVE contains information
about a disk drive with a unique Serial_number, made by a manufacturer, with a
particular model number, released in a certain batch, which has a certain storage
capacity and is sold by a certain retailer. For example, the tuple Disk_drive
(‘1978619’, ‘WesternDigital’, ‘A2235X’, ‘765234’, 500, ‘CompUSA’) specifies
that WesternDigital made a disk drive with serial number 1978619 and model
number A2235X, released in batch 765234; it is 500GB and sold by CompUSA.
Write each of the following dependencies as an FD:
a. The manufacturer and serial number uniquely identifies the drive.
b. A model number is registered by a manufacturer and therefore can’t be
used by another manufacturer.
c. All disk drives in a particular batch are the same model.
d. All disk drives of a certain model of a particular manufacturer have
exactly the same capacity.

Let's break down the business statements and convert them into functional dependencies
(FDs).

Relation: DISK_DRIVE (Serial_number, Manufacturer, Model, Batch, Capacity, Retailer)

a. The manufacturer and serial number uniquely identify the drive.

This means that the combination of Manufacturer and Serial_number can uniquely
identify a specific disk drive. In terms of a functional dependency:

 Manufacturer, Serial_number → Model, Batch, Capacity, Retailer


b. A model number is registered by a manufacturer and therefore can’t be used by
another manufacturer.

This means that for each Model, there is exactly one Manufacturer. A Model cannot be
shared by two manufacturers. In FD form:

 Model → Manufacturer

c. All disk drives in a particular batch are the same model.

This means that if two disk drives share the same Batch, they must also share the same
Model. In FD form:

 Batch → Model

d. All disk drives of a certain model of a particular manufacturer have exactly the
same capacity.

This means that for a given Manufacturer and Model, all the disk drives must have the
same Capacity. In FD form:

 Manufacturer, Model → Capacity

Summary of Functional Dependencies:

1. Manufacturer, Serial_number → Model, Batch, Capacity, Retailer


2. Model → Manufacturer
3. Batch → Model
4. Manufacturer, Model → Capacity
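A quick way to sanity-check FDs like these is to test them against sample tuples. The sketch below is only illustrative: the first row is the example tuple from the exercise, while the other two rows and the helper name `fd_holds` are made up for this example. Note that sample data can refute an FD but can never prove it, since an FD is a statement about all legal states of the relation.

```python
COLUMNS = ("Serial_number", "Manufacturer", "Model", "Batch", "Capacity", "Retailer")

ROWS = [dict(zip(COLUMNS, values)) for values in [
    ("1978619", "WesternDigital", "A2235X", "765234", 500, "CompUSA"),  # from the exercise
    ("1978620", "WesternDigital", "A2235X", "765234", 500, "BestBuy"),  # hypothetical
    ("5550001", "Seagate", "ST1000", "881122", 1000, "CompUSA"),        # hypothetical
]]

def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first RHS seen for this LHS value.
        if seen.setdefault(left, right) != right:
            return False
    return True

print(fd_holds(ROWS, ["Model"], ["Manufacturer"]))              # FD (b)
print(fd_holds(ROWS, ["Batch"], ["Model"]))                     # FD (c)
print(fd_holds(ROWS, ["Manufacturer", "Model"], ["Capacity"]))  # FD (d)
print(fd_holds(ROWS, ["Retailer"], ["Model"]))                  # not an FD
```

On this sample the first three checks succeed, and the last one fails because CompUSA sells two different models.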

14.33. Consider the following relation: R (Doctor#, Patient#, Date, Diagnosis, Treat_code,
Charge) In the above relation, a tuple describes a visit of a patient to a doctor along with
a treatment code and daily charge. Assume that diagnosis is determined (uniquely) for
each patient by a doctor. Assume that each treatment code has a fixed charge
(regardless of patient). Is this relation in 2NF? Justify your answer and decompose if
necessary. Then argue whether further normalization to 3NF is necessary, and if so,
perform it.

To determine whether the relation R(Doctor#, Patient#, Date, Diagnosis, Treat_code, Charge)
is in 2NF (Second Normal Form), we need to analyze its functional dependencies (FDs) and
check for partial dependencies. Then, we will determine if further normalization to 3NF is
necessary.
Step 1: Identify Functional Dependencies

Based on the information provided, we can infer the following functional dependencies:

1. Doctor#, Patient# → Diagnosis


A diagnosis is uniquely determined by the combination of the doctor and patient.
2. Treat_code → Charge
The charge for a treatment is determined by the treatment code, and the charge is
fixed for each treatment.
3. Doctor#, Patient#, Date → Treat_code
A specific doctor, patient, and date combination can determine a treatment code.
4. Doctor#, Patient#, Date → Charge
A specific doctor, patient, and date combination can determine the charge for that
visit (since the charge is fixed based on the treatment code).

Step 2: Check for 2NF Compliance

To be in 2NF, the relation must meet the following criteria:

 It must be in 1NF (i.e., no repeating groups, atomic attributes).


 There should be no partial dependencies (i.e., no non-prime attribute should depend
on a part of a candidate key).

Candidate Key(s):

 The candidate key for this relation would be the combination of Doctor#, Patient#,
Date. This combination uniquely identifies each tuple (because a specific patient sees
a doctor on a particular date, and each visit has a unique treatment code and charge).

Now, let's check for partial dependencies:

 Diagnosis depends on Doctor#, Patient# (part of the candidate key, Doctor#,


Patient#, Date). This is a partial dependency because Diagnosis depends only on a
part of the candidate key.
 Charge depends on Treat_code, but Treat_code depends on the full candidate key
(Doctor#, Patient#, Date). This is not a partial dependency.

Since there is a partial dependency (Diagnosis depends on part of the candidate key
Doctor#, Patient#), the relation is not in 2NF.
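The candidate key used above can be double-checked by brute force: enumerate attribute subsets by increasing size and keep those whose closure is the whole relation. This is a small sketch under the FDs listed in Step 1; the function names are assumptions for the example, and the exhaustive search is only practical for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Return all minimal attribute sets whose closure equals attrs."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if any(key <= subset for key in keys):
                continue
            if closure(subset, fds) == attrs:
                keys.append(subset)
    return keys

ATTRS = {"Doctor#", "Patient#", "Date", "Diagnosis", "Treat_code", "Charge"}
FDS = [
    ({"Doctor#", "Patient#"}, {"Diagnosis"}),
    ({"Treat_code"}, {"Charge"}),
    ({"Doctor#", "Patient#", "Date"}, {"Treat_code"}),
]

keys = candidate_keys(ATTRS, FDS)
print(keys)
```

The search returns a single key, {Doctor#, Patient#, Date}, which agrees with the analysis above.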

Step 3: Decompose the Relation to 2NF

We will decompose the relation to eliminate the partial dependency.

Decomposition:
1. Create a new relation for Diagnosis that depends only on Doctor#, Patient#:

R1(Doctor#, Patient#, Diagnosis)

The functional dependency for this relation is:


Doctor#, Patient# → Diagnosis

2. The remaining attributes will be placed in another relation that has Doctor#,
Patient#, Date as the primary key:

R2(Doctor#, Patient#, Date, Treat_code, Charge)

The functional dependencies for this relation are:

o Doctor#, Patient#, Date → Treat_code


o Treat_code → Charge

Now, both relations are in 2NF because there are no partial dependencies in either of them.

Step 4: Check for 3NF

To be in 3NF, the relation must meet the following criteria:

 It must be in 2NF.
 There must be no transitive dependencies (i.e., no non-prime attribute depends on
another non-prime attribute).

Checking R1 (Doctor#, Patient#, Diagnosis):

 Doctor#, Patient# → Diagnosis


Here, Doctor# and Patient# are the candidate key, and Diagnosis is a non-prime
attribute. There is no transitive dependency, as Diagnosis depends directly on the
candidate key.

Thus, R1 is in 3NF.

Checking R2 (Doctor#, Patient#, Date, Treat_code, Charge):

 Doctor#, Patient#, Date → Treat_code


 Treat_code → Charge
Here, Charge depends on Treat_code, and Treat_code is a non-prime attribute, not a
candidate key. This is a transitive dependency: the key (Doctor#, Patient#, Date)
determines Treat_code, and Treat_code in turn determines Charge.

To remove the transitive dependency, we can decompose R2 further:

1. Create a new relation for Treat_code and Charge:

R3(Treat_code, Charge)

The functional dependency for this relation is:


Treat_code → Charge

2. The remaining relation R2′ will be:

R2′(Doctor#, Patient#, Date, Treat_code)

The functional dependency for this relation is:

Doctor#, Patient#, Date → Treat_code

Now, both R2′ and R3 are in 3NF because there are no transitive dependencies.

Step 5: Final Decomposition

After decomposing to remove partial and transitive dependencies, we have the following
relations:

1. R1(Doctor#, Patient#, Diagnosis)
2. R2′(Doctor#, Patient#, Date, Treat_code)
3. R3(Treat_code, Charge)

Conclusion:

 The original relation was not in 2NF because of partial dependencies.


 After decomposing into 2NF, further normalization to 3NF was necessary because of
a transitive dependency.
 The final decomposition is in 3NF.

14.34. Consider the following relation: CAR_SALE (Car_id, Option_type, Option_listprice,


Sale_date, Option_discountedprice) This relation refers to options installed in cars (e.g.,
cruise control) that were sold at a dealership, and the list and discounted prices of the
options. If CarID → Sale_date and Option_type → Option_listprice and CarID,
Option_type → Option_discountedprice, argue using the generalized definition of the 3NF
that this relation is not in 3NF. Then argue from your knowledge of 2NF, why it is not even
in 2NF

Let's carefully analyze the relation
CAR_SALE(Car_id, Option_type, Option_listprice, Sale_date, Option_discountedprice)
and the functional dependencies:

Given Functional Dependencies:

1. Car_id → Sale_date
The sale date is determined by the car ID, meaning each car is associated with a
single sale date.
2. Option_type → Option_listprice
The list price of an option is determined by the type of option, meaning each option
type has a fixed list price.
3. Car_id, Option_type → Option_discountedprice
The discounted price of an option is determined by both the car ID and the option
type.

Step 1: Argument using the Generalized Definition of 3NF

The generalized definition of 3NF is as follows:

 A relation is in 3NF if for every non-trivial functional dependency X → Y, either:
1. X is a superkey, or
2. Every attribute in Y is a prime attribute (i.e., an attribute that is part of a
candidate key).

Checking the Functional Dependencies for 3NF Compliance:

1. Car_id → Sale_date
o Car_id is not a superkey. The relation's primary key might be a combination
of Car_id and Option_type, but Car_id alone cannot uniquely identify a tuple
because there could be multiple options for the same car.
o Sale_date is not a prime attribute (it is not part of the primary key).
o This is a violation of 3NF, because the non-prime attribute Sale_date depends
on a non-superkey Car_id.
2. Option_type → Option_listprice
o Option_type is not a superkey. The primary key involves both Car_id and
Option_type, but Option_type alone cannot uniquely identify a tuple.
o Option_listprice is not a prime attribute.
o This is a violation of 3NF, because Option_listprice depends on
Option_type, which is not a superkey.
3. Car_id, Option_type → Option_discountedprice
o Car_id, Option_type is the candidate key of the relation (it uniquely
identifies each tuple).
o Option_discountedprice is a non-prime attribute, but since the left-hand side
(Car_id, Option_type) is a superkey, this functional dependency satisfies the
3NF condition.

Conclusion from Generalized 3NF Definition:

The relation is not in 3NF because of the violations of 3NF in the first two functional
dependencies:

 Car_id → Sale_date (Sale_date depends on a non-superkey).


 Option_type → Option_listprice (Option_listprice depends on a non-superkey).
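The generalized 3NF test applied above can be written as a short check. This is a sketch only: it examines just the FDs that are listed, takes the candidate key {Car_id, Option_type} as given rather than deriving it, and uses hypothetical helper names.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def violations_3nf(all_attrs, fds, candidate_keys):
    """Return (lhs, offending_rhs) for each listed FD that breaks 3NF."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == all_attrs
        nonprime_rhs = rhs - lhs - prime  # drop trivial and prime attributes
        if not is_superkey and nonprime_rhs:
            bad.append((lhs, nonprime_rhs))
    return bad

ATTRS = {"Car_id", "Option_type", "Option_listprice", "Sale_date",
         "Option_discountedprice"}
FDS = [
    ({"Car_id"}, {"Sale_date"}),
    ({"Option_type"}, {"Option_listprice"}),
    ({"Car_id", "Option_type"}, {"Option_discountedprice"}),
]
KEYS = [{"Car_id", "Option_type"}]

violations = violations_3nf(ATTRS, FDS, KEYS)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

The first two FDs are reported and the third is not, matching the conclusion above.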

Step 2: Argument Based on 2NF

To be in 2NF, the relation must:

 Be in 1NF (i.e., it must not contain repeating groups or multi-valued attributes).


 Have no partial dependencies (i.e., non-prime attributes should not depend on only
part of a candidate key).

Checking for 2NF Compliance:

The candidate key for the relation is likely Car_id, Option_type (the combination of car
ID and option type uniquely identifies each tuple).

1. Car_id → Sale_date
o Sale_date depends only on part of the candidate key (Car_id). This is a
partial dependency because Car_id is a part of the candidate key, not the full
key.
o This violates 2NF, because Sale_date is a non-prime attribute that depends on
part of the candidate key.
2. Option_type → Option_listprice
o Option_listprice depends only on part of the candidate key (Option_type).
This is another partial dependency.
o This violates 2NF, because Option_listprice is a non-prime attribute that
depends on part of the candidate key.

Conclusion from 2NF:

The relation is not in 2NF because it contains partial dependencies:


 Sale_date depends on part of the candidate key (Car_id).
 Option_listprice depends on part of the candidate key (Option_type).

Step 3: Decomposing to 2NF

To bring the relation into 2NF, we need to remove the partial dependencies:

1. Create a new relation for Sale_date based on Car_id:

R1(Car_id, Sale_date)

The functional dependency for this relation is:

o Car_id → Sale_date
2. Create another relation for Option_listprice based on Option_type:

R2(Option_type, Option_listprice)

The functional dependency for this relation is:

o Option_type → Option_listprice
3. The remaining relation for the options installed in cars, with Car_id and
Option_type as the primary key, will be:

R3(Car_id, Option_type, Option_discountedprice)

The functional dependencies for this relation are:

o Car_id, Option_type → Option_discountedprice

Step 4: Check 3NF Compliance

After decomposing into 2NF:

 R1(Car_id, Sale_date) is in 3NF, as Car_id → Sale_date (and Car_id is a superkey).
 R2(Option_type, Option_listprice) is in 3NF, as Option_type → Option_listprice
(and Option_type is a superkey).
 R3(Car_id, Option_type, Option_discountedprice) is in 3NF, as Car_id, Option_type →
Option_discountedprice (and Car_id, Option_type is a superkey).
Conclusion:

 The original relation is not in 3NF because of violations in the functional


dependencies Car_id → Sale_date and Option_type → Option_listprice.
 The relation is also not in 2NF because of partial dependencies on the candidate
key.
 After decomposing, the relations are brought into 2NF and 3NF.

14.35. Consider the relation: BOOK (Book_Name, Author, Edition, Year) with the data:

a. Based on a common-sense understanding of the above data, what are the possible
candidate keys of this relation? b. Justify that this relation has the MVD {Book} → →
{Author} | {Edition, Year}. c. What would be the decomposition of this relation based on
the above MVD? Evaluate each resulting relation for the highest normal form it possesses.

Let's analyze the relation BOOK(Book_Name, Author, Edition, Year) and address each
part of the question systematically.

Given Data:

Book_Name Author Edition Year


DB_fundamentals Navathe 4 2004
DB_fundamentals Elmasri 4 2004
DB_fundamentals Navathe 5 2007
DB_fundamentals Elmasri 5 2007

a. What are the possible candidate keys of this relation?

A candidate key is a minimal set of attributes that can uniquely identify a tuple in the
relation. In this case, we need to determine what combination of attributes can uniquely
identify each record.

Let's analyze:
 Book_Name: Each row corresponds to a particular book. However, the same book
can have multiple authors, editions, and years, so Book_Name alone is not sufficient
to uniquely identify a row.
 Author: The same author appears once for every edition of the book, so
Author alone is also insufficient.
 Edition and Year: Every author of the book shares the same Edition and Year, so the
combination of Edition and Year alone is not enough.
 Book_Name and Author: Not unique either. In the data, (DB_fundamentals, Navathe)
appears with both edition 4 (2004) and edition 5 (2007).
 Book_Name and Edition: This combination would not uniquely identify a row, as
the same Book_Name and Edition can correspond to multiple authors.
 Book_Name and Year: Similarly, this combination does not work because the
same Book_Name and Year can correspond to multiple authors.
 Book_Name, Author, and Edition: Unique in the data, and dropping any one of the
three attributes breaks uniqueness.
 Book_Name, Author, and Year: Also unique, assuming a book has at most one
edition per year.

Thus, the possible candidate keys for this relation are:

 {Book_Name, Author, Edition}
 {Book_Name, Author, Year}

b. Justify that this relation has the MVD {Book} →→ {Author} | {Edition, Year}.

A Multivalued Dependency (MVD) X →→ Y indicates that for every value of X, the set of
values for Y is independent of the set of values for the rest of the attributes in the
relation. In other words, if two tuples have the same value for X, then swapping their Y
values must produce tuples that are also present in the relation.

Justification for the MVD {Book_Name} →→ {Author} | {Edition, Year}:

 Book_Name determines Author and Edition, Year independently. For any given
Book_Name, the Author(s) can vary independently of the specific Edition and
Year, and vice versa.
 This means that for a given Book_Name, there can be multiple authors (e.g., Navathe
and Elmasri for "DB_fundamentals") and each author can have multiple editions in
different years. The set of Authors is independent of the combinations of Edition
and Year for that Book_Name.
 For example, "DB_fundamentals" is associated with both Navathe and Elmasri, and
with the (Edition, Year) pairs (4, 2004) and (5, 2007). All 2 × 2 = 4 combinations
appear in the data, so the Authors are independent of the Edition and Year.
Therefore, the MVD {Book_Name} →→ {Author} | {Edition, Year} holds.
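The independence argument can be tested directly on the four tuples: for each Book_Name, the set of (Author, (Edition, Year)) pairs must equal the full cross product of its authors and its (Edition, Year) pairs. The sketch below is illustrative; the function name `mvd_holds` and the use of column positions to identify attributes are assumptions for the example.

```python
from itertools import product

ROWS = [
    ("DB_fundamentals", "Navathe", 4, 2004),
    ("DB_fundamentals", "Elmasri", 4, 2004),
    ("DB_fundamentals", "Navathe", 5, 2007),
    ("DB_fundamentals", "Elmasri", 5, 2007),
]

def mvd_holds(rows, x_idx, y_idx):
    """Check X ->> Y on rows; Z is every column that is in neither X nor Y."""
    width = len(rows[0])
    z_idx = [i for i in range(width) if i not in x_idx and i not in y_idx]
    groups = {}
    for row in rows:
        x = tuple(row[i] for i in x_idx)
        y = tuple(row[i] for i in y_idx)
        z = tuple(row[i] for i in z_idx)
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # The MVD requires every Y value to be paired with every Z value.
        if pairs != set(product(ys, zs)):
            return False
    return True

print(mvd_holds(ROWS, [0], [1]))      # Book_Name ->> Author on the full data
print(mvd_holds(ROWS[:3], [0], [1]))  # same check with the last tuple removed
```

The check succeeds on the full data and fails once a tuple is removed, because (Elmasri, 5, 2007) would then be missing from the cross product.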

c. What would be the decomposition of this relation based on the above MVD?
Evaluate each resulting relation for the highest normal form it possesses.

To decompose the relation based on the MVD {Book_Name} →→ {Author} | {Edition, Year},
we can split the relation into two relations:

1. First Relation: Author Information

This relation will capture the MVD's left-hand side (Book_Name) and its dependent set of
attributes (Author):

R1(Book_Name, Author)

 Each book can have multiple authors, so Book_Name does not functionally determine
Author. The only key is the whole relation (Book_Name, Author).
 This relation is in 4NF (and hence BCNF), as it has no non-trivial FDs or MVDs.

2. Second Relation: Edition and Year Information

This relation will capture the remaining attributes of the MVD (Edition, Year), together
with Book_Name:

R2(Book_Name, Edition, Year)

 By common sense, Book_Name, Edition → Year (an edition of a book is published in
one year), so (Book_Name, Edition) is a candidate key.
 The left-hand side of this FD is a superkey, so the relation is in BCNF. Assuming no
other dependencies hold, it is also in 4NF.

Evaluation of Normal Forms:

 R1 (Book_Name, Author) is in 4NF: it is all-key and has no non-trivial MVDs.
 R2 (Book_Name, Edition, Year) is in 4NF, because:
o Its only non-trivial FD, Book_Name, Edition → Year, has a superkey on the
left-hand side (BCNF).
o It has no non-trivial MVD whose left-hand side is not a superkey. In the data,
editions and years are not independent: edition 4 occurs only with 2004.

Conclusion:

 The decomposition results in two relations: R1 and R2.
 R1 is in 4NF.
 R2 is in 4NF (under the assumption Book_Name, Edition → Year).

Because the MVD holds, this decomposition is lossless: joining R1 and R2 on Book_Name
returns exactly the original four tuples.
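As a small check on the decomposition, the four tuples can be projected onto R1 and R2 and then joined back on Book_Name. This is a sketch using plain Python sets as relations; the variable names are chosen for the example.

```python
ROWS = {
    ("DB_fundamentals", "Navathe", 4, 2004),
    ("DB_fundamentals", "Elmasri", 4, 2004),
    ("DB_fundamentals", "Navathe", 5, 2007),
    ("DB_fundamentals", "Elmasri", 5, 2007),
}

# Project onto the two schemas suggested by the MVD.
r1 = {(book, author) for book, author, _, _ in ROWS}
r2 = {(book, edition, year) for book, _, edition, year in ROWS}

# Natural join of r1 and r2 on Book_Name.
rejoined = {
    (book, author, edition, year)
    for book, author in r1
    for book2, edition, year in r2
    if book == book2
}

print(len(r1), len(r2))
print(rejoined == ROWS)
```

Each projection holds two tuples instead of four, and the join reproduces the original relation with no spurious tuples.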

14.36. Consider the following relation: TRIP (Trip_id, Start_date, Cities_visited,


Cards_used) This relation refers to business trips made by company salespeople.
Suppose the TRIP has a single Start_date but involves many Cities and salespeople may
use multiple credit cards on the trip. Make up a mock-up population of the table. a.
Discuss what FDs and/or MVDs exist in this relation. b. Show how you will go about
normalizing the relation.

Let's first consider the structure and the potential dependencies of the relation
TRIP(Trip_id, Start_date, Cities_visited, Cards_used).

Given Relation:

 Trip_id: A unique identifier for each trip.


 Start_date: The date the trip started.
 Cities_visited: The list of cities visited during the trip.
 Cards_used: The credit cards used during the trip.

Mock-up Population of the Table:

Trip_id   Start_date   Cities_visited           Cards_used
1         2025-04-01   New York, Boston         Visa, MasterCard
2         2025-04-02   Chicago, San Francisco   Discover, Amex
3         2025-04-03   Boston, Chicago          Visa, MasterCard, Amex
4         2025-04-04   Miami, Atlanta           MasterCard

In this mock-up:

 Trip_id uniquely identifies each trip.


 A trip has a single Start_date, but can involve multiple Cities_visited and multiple
Cards_used.
 Each Trip_id can have a different combination of cities and cards.

a. Discuss what FDs and/or MVDs exist in this relation.


Functional Dependencies (FDs):

1. Trip_id → Start_date:
o A Trip_id uniquely determines the Start_date. Each trip has one and only one
start date.
2. Trip_id → Cities_visited:
o This holds only in the unnormalized table, where Cities_visited stores the whole
set of cities as a single value. Once each city is stored in its own tuple, a trip
has several city values and the dependency becomes the MVD
Trip_id →→ Cities_visited.
3. Trip_id → Cards_used:
o Likewise, this holds only while Cards_used stores the whole set of cards as a
single value. After flattening, it becomes the MVD Trip_id →→ Cards_used.

Thus, the main FDs are:

 Trip_id → Start_date
 Trip_id → Cities_visited (unnormalized form only)
 Trip_id → Cards_used (unnormalized form only)

Multivalued Dependencies (MVDs):

 Trip_id →→ Cities_visited:
o A Trip_id can involve multiple Cities_visited, and the cities are independent
of other attributes in the relation (like Cards_used). So, a Trip_id can
determine a set of cities independently of the credit cards used.
 Trip_id →→ Cards_used:
o Similarly, a Trip_id can involve multiple Cards_used, and the credit cards are
independent of other attributes in the relation (like Cities_visited). So, a
Trip_id can determine a set of cards used independently of the cities visited.

Thus, the MVDs are:

 Trip_id →→ Cities_visited
 Trip_id →→ Cards_used
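
An MVD can also be tested on sample data. The sketch below assumes a relation with one city and one card per tuple; the helper name mvd_holds and the two sample instances are illustrative. It uses the fact that X →→ Y holds in an instance exactly when, for each X value, the tuples form the full cross product of the Y values and the values of the remaining attributes.

```python
from collections import defaultdict
from itertools import product

COLUMNS = ("Trip_id", "City_visited", "Card_used")

# Hypothetical instance where every city is paired with every card.
FULL = [
    (1, "New York", "Visa"),
    (1, "New York", "MasterCard"),
    (1, "Boston", "Visa"),
    (1, "Boston", "MasterCard"),
]

# Hypothetical instance where cities and cards are paired one-to-one.
PAIRED = [
    (1, "New York", "Visa"),
    (1, "Boston", "MasterCard"),
]


def mvd_holds(rows, columns, lhs, mid):
    """Return True if lhs ->> mid holds in this particular set of rows."""
    lhs_idx = [columns.index(a) for a in lhs]
    mid_idx = [columns.index(a) for a in mid]
    rest_idx = [i for i in range(len(columns))
                if i not in lhs_idx and i not in mid_idx]
    groups = defaultdict(set)
    for row in rows:
        x = tuple(row[i] for i in lhs_idx)
        y = tuple(row[i] for i in mid_idx)
        z = tuple(row[i] for i in rest_idx)
        groups[x].add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # Every y must appear with every z for the same lhs value.
        if pairs != set(product(ys, zs)):
            return False
    return True


mvd_full = mvd_holds(FULL, COLUMNS, ["Trip_id"], ["City_visited"])
mvd_paired = mvd_holds(PAIRED, COLUMNS, ["Trip_id"], ["City_visited"])
print(mvd_full, mvd_paired)
```

The one-to-one instance fails the check because it is missing tuples such as (1, New York, MasterCard), which the independence of cities and cards requires.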

b. Show how you will go about normalizing the relation.

Let's go step by step to normalize the relation based on the above FDs and MVDs:

Step 1: Ensure 1NF (First Normal Form)


To bring the relation into 1NF, we need to ensure that there are no multi-valued attributes.
In the current relation, Cities_visited and Cards_used are multi-valued attributes, meaning
a trip can involve multiple cities and multiple credit cards.

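To convert it into 1NF, we need to decompose these multi-valued attributes so that each
attribute holds only a single value in each tuple.

 Create one tuple for every combination of a city and a card on the trip. Because
cities and cards are independent, each city must be paired with each card.
Pairing them one-to-one would wrongly suggest that a particular card was used
in a particular city.

Revised Relation in 1NF:

Trip_id  Start_date  City_visited   Card_used
1        2025-04-01  New York       Visa
1        2025-04-01  New York       MasterCard
1        2025-04-01  Boston         Visa
1        2025-04-01  Boston         MasterCard
2        2025-04-02  Chicago        Discover
2        2025-04-02  Chicago        Amex
2        2025-04-02  San Francisco  Discover
2        2025-04-02  San Francisco  Amex
3        2025-04-03  Boston         Visa
3        2025-04-03  Boston         MasterCard
3        2025-04-03  Boston         Amex
3        2025-04-03  Chicago        Visa
3        2025-04-03  Chicago        MasterCard
3        2025-04-03  Chicago        Amex
4        2025-04-04  Miami          MasterCard
4        2025-04-04  Atlanta        MasterCard

Now, the relation is in 1NF, as there are no multi-valued attributes.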
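
One way to build a flattened relation that keeps cities and cards independent is to pair every city of a trip with every card of that trip. The following is a minimal sketch of that idea using the mock-up data; the names TRIPS and flatten are illustrative.

```python
from itertools import product

# Mock-up population: (Trip_id, Start_date, Cities_visited, Cards_used).
TRIPS = [
    (1, "2025-04-01", ["New York", "Boston"], ["Visa", "MasterCard"]),
    (2, "2025-04-02", ["Chicago", "San Francisco"], ["Discover", "Amex"]),
    (3, "2025-04-03", ["Boston", "Chicago"], ["Visa", "MasterCard", "Amex"]),
    (4, "2025-04-04", ["Miami", "Atlanta"], ["MasterCard"]),
]


def flatten(trips):
    """Emit one tuple per (city, card) combination of each trip."""
    rows = []
    for trip_id, start_date, cities, cards in trips:
        for city, card in product(cities, cards):
            rows.append((trip_id, start_date, city, card))
    return rows


flat_rows = flatten(TRIPS)
for row in flat_rows:
    print(row)
```

The number of tuples per trip is the number of cities times the number of cards, which is why Start_date ends up repeated many times in the 1NF relation.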

Step 2: Ensure 2NF (Second Normal Form)

To ensure 2NF, the relation must be in 1NF, and there must be no partial dependencies. A
partial dependency exists if a non-prime attribute is functionally dependent on part of a
composite key.

After converting to 1NF, Trip_id no longer identifies a tuple, because each trip now
spans several tuples. The only candidate key is {Trip_id, City_visited, Card_used}, which
leaves Start_date as the only non-prime attribute. Since Trip_id → Start_date, Start_date
depends on only part of the key. This is a partial dependency, so the 1NF relation is not
in 2NF.

To remove the partial dependency, move Start_date into its own relation:

 TRIP (Trip_id, Start_date), with key Trip_id
 TRIP_CITY_CARD (Trip_id, City_visited, Card_used), with key {Trip_id, City_visited,
Card_used}

Thus, both resulting relations are in 2NF.

Step 3: Ensure 3NF (Third Normal Form)

To ensure 3NF, the relation must be in 2NF, and there should be no transitive
dependencies. A transitive dependency occurs when a non-prime attribute depends on
another non-prime attribute.

Let's look at the FDs that hold in the two relations:

 TRIP: Trip_id → Start_date
 TRIP_CITY_CARD: no non-trivial FDs

In TRIP, the only non-prime attribute, Start_date, depends directly on the key. In
TRIP_CITY_CARD, every attribute is part of the key, so there are no non-prime attributes
at all. Neither relation has a transitive dependency.

Thus, both relations are already in 3NF.
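
The candidate key of the flattened relation can be checked with an attribute-closure computation. The sketch below assumes that the only non-trivial FD on the one-value-per-tuple attributes is Trip_id → Start_date; the function name closure and the variable names are illustrative.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left-hand side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


ALL_ATTRS = {"Trip_id", "Start_date", "City_visited", "Card_used"}

# Assumption: the only non-trivial FD in the flattened relation.
FDS = [(("Trip_id",), ("Start_date",))]

trip_closure = closure({"Trip_id"}, FDS)
key_closure = closure({"Trip_id", "City_visited", "Card_used"}, FDS)

print(sorted(trip_closure))
print(key_closure == ALL_ATTRS)
```

Under that assumption, Trip_id alone determines only Start_date, so it is not a key of the flattened relation, while {Trip_id, City_visited, Card_used} determines every attribute.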

Step 4: Ensure 4NF (Fourth Normal Form)

To ensure 4NF, the relation must be in 3NF, and there should be no multi-valued
dependencies that violate 4NF. An MVD violates 4NF if it is non-trivial and its
left-hand side is not a superkey.

We identified the following MVDs:

 Trip_id →→ Cities_visited
 Trip_id →→ Cards_used

In a relation that stores Trip_id, City_visited, and Card_used together, a Trip_id
appears in several tuples, so Trip_id is not a superkey. Both MVDs are non-trivial
there, so they violate 4NF. To bring the design into 4NF, we need to decompose based on
these MVDs.

Decompose the city and card data into two relations:

1. R1 (Trip_id, City_visited)
2. R2 (Trip_id, Card_used)

Start_date depends functionally on Trip_id, so it is stored once per trip in a relation
keyed by Trip_id, TRIP (Trip_id, Start_date), rather than repeated in R1 or R2.

Resulting Relations:

 R1 (Trip_id, City_visited):
o Trip_id →→ City_visited still holds, but it is trivial here because Trip_id
and City_visited together make up all the attributes of R1.
o This relation is in 4NF because there are no non-trivial MVDs violating 4NF.
 R2 (Trip_id, Card_used):
o Trip_id →→ Card_used still holds, but it is trivial for the same reason.
o This relation is also in 4NF.
 TRIP (Trip_id, Start_date):
o Trip_id is the key and Trip_id → Start_date is the only dependency, so
this relation is also in 4NF.
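
A decomposition based on these MVDs should be lossless: joining R1 and R2 on Trip_id must give back exactly the city and card combinations of each trip. The following is a minimal sketch of that check, assuming every city of a trip is paired with every card of that trip; the helper names project and join_on_trip are illustrative.

```python
from itertools import product

# Mock-up data: (Trip_id, cities, cards).
TRIPS = [
    (1, ["New York", "Boston"], ["Visa", "MasterCard"]),
    (2, ["Chicago", "San Francisco"], ["Discover", "Amex"]),
    (3, ["Boston", "Chicago"], ["Visa", "MasterCard", "Amex"]),
    (4, ["Miami", "Atlanta"], ["MasterCard"]),
]

# Flattened relation over (Trip_id, City_visited, Card_used).
rows = {
    (trip_id, city, card)
    for trip_id, cities, cards in TRIPS
    for city, card in product(cities, cards)
}


def project(relation, indexes):
    """Project a set of tuples onto the given column positions."""
    return {tuple(row[i] for i in indexes) for row in relation}


def join_on_trip(r1, r2):
    """Natural join of R1(Trip_id, City) and R2(Trip_id, Card) on Trip_id."""
    return {
        (trip1, city, card)
        for trip1, city in r1
        for trip2, card in r2
        if trip1 == trip2
    }


r1 = project(rows, (0, 1))  # R1 (Trip_id, City_visited)
r2 = project(rows, (0, 2))  # R2 (Trip_id, Card_used)
lossless = join_on_trip(r1, r2) == rows
print(len(r1), len(r2), lossless)
```

If the flattened relation paired cities and cards one-to-one instead, the join would produce tuples that were never stored, which is a sign that the MVDs did not hold in that instance.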

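Conclusion:
 The original relation violates 1NF because Cities_visited and Cards_used are
multi-valued. The flattened 1NF relation has the key {Trip_id, City_visited,
Card_used} and violates 2NF, because Start_date depends on Trip_id alone. It also
violates 4NF because of the MVDs on Trip_id.
 After moving Start_date into its own relation with Trip_id and decomposing the rest
based on the MVDs, every resulting relation is in 4NF, and the only MVDs that
remain are trivial.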
