Infosys-Normalisation-2nd 3rd BCNF

Uploaded by

Raj Kumar
Copyright
© © All Rights Reserved

Functional dependency

• In a given relation R, let X and Y be attributes. Y is functionally dependent on X if each value of X determines EXACTLY ONE value of Y. This is written X → Y (X can be composite in nature).

• We say "X determines Y" or "Y is functionally dependent on X".

• X → Y does not imply Y → X.

• If the value of the attribute "Marks" is known, then the value of the attribute "Grade" is determined, since Marks → Grade.

• Types of functional dependencies:

– Full functional dependency
– Partial functional dependency
– Transitive dependency
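
The definition of X → Y can be tested against sample data. The following is a minimal sketch, not part of the original slides: the helper name `holds` is hypothetical, and the rows reuse Marks/Grade pairs from the Result table in this deck. A passing check only shows that the sample does not violate the dependency; it does not prove the dependency for all possible data.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"Marks": 82, "Grade": "A"},
    {"Marks": 62, "Grade": "C"},
    {"Marks": 79, "Grade": "B"},
    {"Marks": 65, "Grade": "B"},
    {"Marks": 77, "Grade": "B"},
]

marks_to_grade = holds(rows, ["Marks"], ["Grade"])  # each Marks value has one Grade
grade_to_marks = holds(rows, ["Grade"], ["Marks"])  # Grade "B" maps to several Marks
```

Here Marks → Grade holds in the sample while Grade → Marks does not, which illustrates that X → Y does not imply Y → X.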

ER/CORP/CRS/DB07/003
Copyright © 2004, Infosys Technologies Ltd
Version No: 2.0
Functional Dependencies
Consider the following Relation

REPORT (STUDENT#, COURSE#, CourseName, IName, Room#, Marks, Grade)

• STUDENT# - Student Number
• COURSE# - Course Number
• CourseName - Course Name
• IName - Name of the Instructor who delivered the course
• Room# - Room number which is assigned to respective Instructor
• Marks - Scored in Course COURSE# by Student STUDENT#
• Grade - obtained by Student STUDENT# in Course COURSE#

Functional Dependencies - From the previous example

• STUDENT#, COURSE# → Marks

• COURSE# → CourseName

• COURSE# → IName (assuming each course is taught by one and only one instructor)

• IName → Room# (assuming each instructor has his/her own, non-shared room)

• Marks → Grade
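
One way to reason about this list is to compute an attribute closure: the set of all attributes determined by a given set. The sketch below is illustrative only; the function name `closure` and the tuple encoding of the dependencies are assumptions, while the dependencies themselves are the five listed above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is already covered and rhs adds something
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


FDS = [
    (("STUDENT#", "COURSE#"), ("Marks",)),
    (("COURSE#",), ("CourseName",)),
    (("COURSE#",), ("IName",)),
    (("IName",), ("Room#",)),
    (("Marks",), ("Grade",)),
]
ALL_ATTRS = {"STUDENT#", "COURSE#", "CourseName", "IName", "Room#", "Marks", "Grade"}

key_closure = closure({"STUDENT#", "COURSE#"}, FDS)
course_closure = closure({"COURSE#"}, FDS)
```

Under these dependencies the closure of {STUDENT#, COURSE#} is the whole REPORT relation, so the pair acts as a key, while COURSE# alone reaches only CourseName, IName and Room#.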

Dependency diagram
Report(S#, C#, SName, CTitle, LName, Room#, Marks, Grade)
• S# → SName
• C# → CTitle
• C# → LName
• LName → Room#
• C# → Room#
• S#, C# → Marks
• Marks → Grade
• S#, C# → Grade

Assumptions:
• Each course has only one lecturer and each lecturer has a room.
• Grade is determined from Marks.

Full dependencies

X and Y are attributes (X may be composite).

Y is fully functionally dependent on X if X functionally determines Y and
no proper subset of X functionally determines Y.


In the above example, Marks is fully functionally dependent on STUDENT# COURSE# and not on any proper subset of STUDENT# COURSE#. This means Marks cannot be determined by STUDENT# or COURSE# alone; it can be determined only by STUDENT# and COURSE# together.
CourseName is not fully functionally dependent on STUDENT# COURSE#, because a subset of STUDENT# COURSE#, namely COURSE# alone, determines CourseName, and STUDENT# plays no role in deciding CourseName.

Partial dependencies
X and Y are attributes, where X is composite.
Attribute Y is partially dependent on X if Y is functionally dependent on a proper subset of X.


In the above relation, CourseName, IName and Room# are partially dependent on the composite attribute STUDENT# COURSE#, because COURSE# alone determines CourseName, IName and Room#.
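
The difference between full and partial dependency on a composite key can be checked mechanically. This is a sketch under stated assumptions: `closure` and `dependency_kind` are hypothetical helper names, and the dependencies are those listed for the REPORT relation earlier in the deck.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def dependency_kind(key, attr, fds):
    """Classify how attr depends on the composite key: 'full', 'partial' or 'none'."""
    key = tuple(key)
    if attr not in closure(key, fds):
        return "none"
    # Partial if some proper, non-empty subset of the key already determines attr
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if attr in closure(subset, fds):
                return "partial"
    return "full"


FDS = [
    (("STUDENT#", "COURSE#"), ("Marks",)),
    (("COURSE#",), ("CourseName",)),
    (("COURSE#",), ("IName",)),
    (("IName",), ("Room#",)),
    (("Marks",), ("Grade",)),
]
KEY = ("STUDENT#", "COURSE#")

kinds = {a: dependency_kind(KEY, a, FDS)
         for a in ["Marks", "Grade", "CourseName", "IName", "Room#"]}
```

With these inputs, Marks and Grade come out as fully dependent on the composite key, while CourseName, IName and Room# come out as partially dependent, matching the notes above.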

Transitive dependencies

X, Y and Z are three attributes.

X → Y
Y → Z
=> X → Z


In the above example, Room# depends on IName, and IName in turn depends on COURSE#. Hence Room# transitively depends on COURSE#.
Similarly, Grade depends on Marks, and Marks in turn depends on STUDENT# COURSE#. Hence Grade depends transitively on the full composite key STUDENT# COURSE#.

Transitive: Indirect
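
A transitive dependency can be located by looking for an intermediate determinant. The sketch below is one possible formulation, not taken from the slides: `transitive_via` is a hypothetical helper that reports each Y with X → Y and Y → Z where Y is not part of X and does not determine X back.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def transitive_via(x, z, fds):
    """List the determinants Y through which z depends indirectly on x."""
    x = set(x)
    reachable = closure(x, fds)
    found = []
    for lhs, rhs in fds:
        y = set(lhs)
        if (z in rhs                          # Y -> Z is a listed dependency
                and not y <= x                # Y is not just a part of X
                and y <= reachable            # X -> Y
                and not x <= closure(y, fds)):  # Y does not determine X
            found.append(tuple(lhs))
    return found


FDS = [
    (("STUDENT#", "COURSE#"), ("Marks",)),
    (("COURSE#",), ("CourseName",)),
    (("COURSE#",), ("IName",)),
    (("IName",), ("Room#",)),
    (("Marks",), ("Grade",)),
]

room_path = transitive_via({"COURSE#"}, "Room#", FDS)
grade_path = transitive_via({"STUDENT#", "COURSE#"}, "Grade", FDS)
marks_path = transitive_via({"STUDENT#", "COURSE#"}, "Marks", FDS)
```

For the REPORT dependencies this reports IName as the intermediate step for Room# and Marks as the intermediate step for Grade, while Marks itself has no intermediate step.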

First normal form: 1NF
• A relation schema is in 1NF :

– if and only if all the attributes of the relation R are atomic in nature.

– Atomic: the smallest level to which data may be broken down and remain
meaningful
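
As an illustration of atomicity, the sketch below flattens a column that holds a list of values into one row per value. The input layout is hypothetical (the slides do not show a list-valued column), and `to_1nf` is an assumed helper name; the student and course values are borrowed from the sample table in this deck.

```python
def to_1nf(rows, multivalued):
    """Emit one row per value of the multivalued column so every attribute is atomic."""
    flat = []
    for row in rows:
        for value in row[multivalued]:
            atomic = dict(row)           # copy the other attributes unchanged
            atomic[multivalued] = value  # replace the list with a single value
            flat.append(atomic)
    return flat


# Hypothetical un-normalized input: Course# holds a list, so it is not atomic.
unnormalized = [
    {"Student#": 101, "StudentName": "Davis", "Course#": ["M4", "H6"]},
    {"Student#": 104, "StudentName": "Evelyn", "Course#": ["B3", "M4"]},
]

flat = to_1nf(unnormalized, "Course#")
```

Each output row carries exactly one Course# value, at the cost of repeating Student# and StudentName, which is the redundancy the later normal forms address.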


In relational database design it is not practically possible to have a table which is not in 1NF.

Student_Course_Result Table

Student_Details Course_Details Results


101  Davis   11/4/1986  M4  Applied Mathematics  Basic Mathematics   7  11/11/2004  82  A
102  Daniel  11/6/1987  M4  Applied Mathematics  Basic Mathematics   7  11/11/2004  62  C
101  Davis   11/4/1986  H6  American History                         4  11/22/2004  79  B
103  Sandra  10/2/1988  C3  Bio Chemistry        Basic Chemistry    11  11/16/2004  65  B
104  Evelyn  2/22/1986  B3  Botany                                   8  11/26/2004  77  B
102  Daniel  11/6/1987  P3  Nuclear Physics      Basic Physics      13  11/12/2004  68  B
105  Susan   8/31/1985  P3  Nuclear Physics      Basic Physics      13  11/12/2004  89  A
103  Sandra  10/2/1988  B4  Zoology                                  5  11/27/2004  54  D
105  Susan   8/31/1985  H6  American History                         4  11/22/2004  87  A
104  Evelyn  2/22/1986  M4  Applied Mathematics  Basic Mathematics   7  11/11/2004  65  B

Table in 1NF Student_Course_Result Table
Student#  StudentName  DateofBirth  Course#  CourseName           PreRequisite       DurationInDays  DateOfExam   Marks  Grade

101       Davis        04-Nov-1986  M4       Applied Mathematics  Basic Mathematics  7               11-Nov-2004  82     A
102       Daniel       06-Nov-1987  M4       Applied Mathematics  Basic Mathematics  7               11-Nov-2004  62     C
101       Davis        04-Nov-1986  H6       American History                        4               22-Nov-2004  79     B
103       Sandra       02-Oct-1988  C3       Bio Chemistry        Basic Chemistry    11              16-Nov-2004  65     B
104       Evelyn       22-Feb-1986  B3       Botany                                  8               26-Nov-2004  77     B
102       Daniel       06-Nov-1987  P3       Nuclear Physics      Basic Physics      13              12-Nov-2004  68     B
105       Susan        31-Aug-1985  P3       Nuclear Physics      Basic Physics      13              12-Nov-2004  89     A
103       Sandra       02-Oct-1988  B4       Zoology                                 5               27-Nov-2004  54     D
105       Susan        31-Aug-1985  H6       American History                        4               22-Nov-2004  87     A
104       Evelyn       22-Feb-1986  M4       Applied Mathematics  Basic Mathematics  7               11-Nov-2004  65     B

Second normal form: 2NF
• A relation is said to be in Second Normal Form if and only if:
– It is in the First Normal Form, and
– No partial dependency exists between non-key attributes and key attributes.

• An attribute of a relation R that belongs to any key of R is said to be a prime attribute, and one that doesn't is a non-prime attribute.

To make a table 2NF compliant, we have to remove all the partial dependencies.

Note: In 2NF, all partial dependencies are eliminated.

Prime Vs Non-Prime Attributes
• An attribute of a relation R that belongs to any key of R is said to be a prime attribute, and one that doesn't is a non-prime attribute.

Report(S#, C#, StudentName, DateOfBirth, CourseName, PreRequisite, DurationInDays, DateOfExam, Marks, Grade)

PRIME attributes: Student#, Course#
NON-PRIME attributes: StudentName, DateOfBirth, CourseName, PreRequisite, Marks, Grade, DurationInDays, DateOfExam

Second Normal Form
• STUDENT# is the key attribute for Student.
• COURSE# is the key attribute for Course.
• STUDENT# COURSE# together form the composite key for the Results relationship.
• Other attributes like StudentName, DateofBirth, CourseName, PreRequisite, DurationInDays, DateofExam, Marks and Grade are non-key attributes.

To make this table 2NF compliant, we have to remove all the partial dependencies.

Student#, Course# → Marks, Grade
Student# → StudentName, DOB
Course# → CourseName, PreRequisite, DurationInDays
Course# → DateOfExam

Second Normal Form

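S#, C# → Marks        Fully functionally dependent on composite candidate key
S#, C# → Grade        Fully functionally dependent on composite candidate key

S# → StudentName      Partial dependency
S# → DOB              Partial dependency

C# → CourseName       Partial dependency
C# → Prerequisite     Partial dependency
C# → Duration         Partial dependency

C# → DateOfExam       Partial dependency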


Second Normal Form - Tables in 2 NF
STUDENT TABLE
Student#  StudentName  DateofBirth
101       Davis        04-Nov-1986
102       Daniel       06-Nov-1987
103       Sandra       02-Oct-1988
104       Evelyn       22-Feb-1986
105       Susan        31-Aug-1985
106       Mike         04-Feb-1987
107       Juliet       09-Nov-1986
108       Tom          07-Oct-1986
109       Catherine    06-Jun-1984

COURSE TABLE
Course#  CourseName           PreRequisite  DurationInDays
M1       Basic Mathematics                  11
M4       Applied Mathematics  M1            7
H6       American History                   4
C1       Basic Chemistry                    5
C3       Bio Chemistry        C1            11
B3       Botany                             8
P1       Basic Physics                      8
P3       Nuclear Physics      P1            13
B4       Zoology                            5


Let us re-visit our 1NF table structure.

STUDENT# is the key attribute for Student.
COURSE# is the key attribute for Course.
STUDENT# COURSE# together form the composite key for the Results relationship.
Other attributes like StudentName, DateofBirth, CourseName, PreRequisite, DurationInDays, DateofExam, Marks and Grade are non-key attributes.
To make this table 2NF compliant, we have to remove all the partial dependencies.
StudentName and DateofBirth depend only on STUDENT#.
CourseName, PreRequisite and DurationInDays depend only on COURSE#.
DateofExam depends only on COURSE#.
Marks and Grade depend on STUDENT# COURSE#.
To remove these partial dependencies, we can create four separate tables: Student, Course, Result and Exam_Date.
In the first table (STUDENT), the key attribute is STUDENT# and all other non-key attributes are fully functionally dependent on the key attribute.
In the second table (COURSE), COURSE# is the key attribute and all the non-key attributes are fully functionally dependent on the key attribute.
In the third table (Result), STUDENT# COURSE# together are the key attributes, and the non-key attributes Marks and Grade are fully functionally dependent on them.
In the fourth table (Exam_Date), DateOfExam depends only on COURSE#.
These four tables also comply with the First Normal Form definition. Hence these four tables are in Second Normal Form (2NF).
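
The decomposition described above amounts to projecting the 1NF table onto each group of attributes and dropping duplicate rows. The following is a minimal sketch, assuming a hypothetical helper `project` and a four-row, reduced-column excerpt of the 1NF table; it is not the deck's own procedure.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    seen = set()
    out = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


COLUMNS = ["Student#", "StudentName", "Course#", "CourseName",
           "DateOfExam", "Marks", "Grade"]
DATA = [
    (101, "Davis", "M4", "Applied Mathematics", "11-Nov-2004", 82, "A"),
    (102, "Daniel", "M4", "Applied Mathematics", "11-Nov-2004", 62, "C"),
    (101, "Davis", "H6", "American History", "22-Nov-2004", 79, "B"),
    (105, "Susan", "H6", "American History", "22-Nov-2004", 87, "A"),
]
rows = [dict(zip(COLUMNS, values)) for values in DATA]

student = project(rows, ["Student#", "StudentName"])
course = project(rows, ["Course#", "CourseName"])
exam_date = project(rows, ["Course#", "DateOfExam"])
result = project(rows, ["Student#", "Course#", "Marks", "Grade"])
```

In this excerpt the student and course details, which were repeated in every result row, collapse to one row per student and one row per course.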

Second Normal form – Tables in 2 NF

Student# Course# Marks Grade

101 M4 82 A
102 M4 62 C
101 H6 79 B
103 C3 65 B
104 B3 77 B
102 P3 68 B
105 P3 89 A
103 B4 54 D
105 H6 87 A
104 M4 65 B

Second Normal form – Tables in 2 NF

Exam_Date Table
Course# DateOfExam
M4 11-Nov-04

H6 22-Nov-04

C3 16-Nov-04

B3 26-Nov-04

P3 12-Nov-04

B4 27-Nov-04

Third normal form: 3 NF
A relation R is said to be in the Third Normal Form (3NF) if and only if
− It is in 2NF and
− No transitive dependency exists between non-key attributes and
key attributes.

• STUDENT# and COURSE# are the key attributes.

• All other attributes, except Grade, are non-partially and non-transitively dependent on the key attributes.

• Student#, Course# → Marks

• Marks → Grade

• Hence S#, C# → Marks → Grade: Grade depends on the key transitively, through Marks.

Note: In 3NF, all transitive dependencies are eliminated.

3NF Tables
Student# Course# Marks

101 M4 82
102 M4 62
101 H6 79
103 C3 65
104 B3 77
102 P3 68
105 P3 89
103 B4 54
105 H6 87
104 M4 65

Third Normal Form – Tables in 3 NF
MARKSGRADE TABLE
UpperBound LowerBound Grade

100 95 A+

94 85 A

84 70 B

69 65 B-

64 55 C

54 45 D

44 0 E
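
Once Grade is moved out of the Result table, it can be derived from Marks by a range lookup. The sketch below is illustrative: `grade_for` is a hypothetical helper, and the bounds are copied from the MARKSGRADE table above. These bounds may not reproduce every grade shown in the earlier sample rows.

```python
# (UpperBound, LowerBound, Grade) rows copied from the MARKSGRADE table
MARKSGRADE = [
    (100, 95, "A+"),
    (94, 85, "A"),
    (84, 70, "B"),
    (69, 65, "B-"),
    (64, 55, "C"),
    (54, 45, "D"),
    (44, 0, "E"),
]


def grade_for(marks):
    """Return the grade whose [LowerBound, UpperBound] range contains marks."""
    for upper, lower, grade in MARKSGRADE:
        if lower <= marks <= upper:
            return grade
    raise ValueError(f"marks out of range: {marks}")


grades = {m: grade_for(m) for m in (89, 62, 54)}
```

Storing the ranges once, instead of a Grade value in every result row, removes the Marks → Grade dependency from the Result table.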

Boyce-Codd Normal form - BCNF

A relation is said to be in Boyce-Codd Normal Form (BCNF) if and only if all the determinants are candidate keys.

BCNF is a stronger form of 3NF: every BCNF relation is in 3NF, but not every 3NF relation is in BCNF.


Let us understand this concept using a slightly different RESULT table structure, with the columns Student#, EmailID, Course# and Marks. This table has two candidate keys, namely STUDENT# COURSE# and COURSE# EMailID.
COURSE# is common to both candidate keys. Hence these candidate keys are called "Overlapping Candidate Keys".
The non-key attribute Marks is non-transitively and fully functionally dependent on the key attributes. Hence the table is in 3NF. But it is not in BCNF, because there are four determinants in this relation, namely:
STUDENT# (STUDENT# decides EMailID)
EMailID (EMailID decides STUDENT#)
STUDENT# COURSE# (decides Marks)
COURSE# EMailID (decides Marks).
Not all of them are candidate keys. Only the combinations STUDENT# COURSE# and COURSE# EMailID are candidate keys.

Consider this Result Table

Student# EmailID Course# Marks

101 [email protected] M4 82
102 [email protected] M4 62
101 [email protected] H6 79
103 [email protected] C3 65
104 [email protected] B3 77
102 [email protected] P3 68
105 [email protected] P3 89
103 [email protected] B4 54
105 [email protected] H6 87
104 [email protected] M4 65

BCNF
Candidate keys for the relation are

S# C# and C# EmailID

Since Course# is common to both, they are referred to as overlapping candidate keys.

Valid functional dependencies are

S# → EMailID (non-key determinant)

EMailID → S# (non-key determinant)

S#, C# → Marks

C#, EMailID → Marks

BCNF

STUDENT TABLE
Student# EmailID

101 [email protected]

102 [email protected]

103 [email protected]

104 [email protected]

105 [email protected]


Now both tables are not only in 3NF but also in BCNF, because all the determinants are candidate keys. In the first table, STUDENT# decides EMailID and EMailID decides STUDENT#, and both are candidate keys. In the second table, STUDENT# COURSE# decides all other non-key attributes, and it is the composite candidate key as well as the only determinant.
Note: If a table is in 3NF and has a single attribute as its candidate key, or has no overlapping candidate keys, then the table is also in BCNF.
Basically, BCNF takes away the redundancy and anomalies that exist among the key attributes. At Infosys, we rarely (in around 1% of database designs) normalize databases to BCNF.
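
The BCNF test "every determinant is a candidate key" can be sketched with attribute closures, checking that each determinant determines the whole relation. The helper names `closure` and `bcnf_violations` and the tuple encoding are assumptions made for illustration; the dependencies are the four listed for the Result table.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the non-trivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, never a violation
        if not set(attrs) <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


S_TO_E = (("STUDENT#",), ("EMailID",))
E_TO_S = (("EMailID",), ("STUDENT#",))
SC_TO_M = (("STUDENT#", "COURSE#"), ("Marks",))
CE_TO_M = (("COURSE#", "EMailID"), ("Marks",))

# Original Result table: STUDENT# and EMailID are determinants but not keys.
original = bcnf_violations({"STUDENT#", "EMailID", "COURSE#", "Marks"},
                           [S_TO_E, E_TO_S, SC_TO_M, CE_TO_M])

# After decomposition, each table is checked against its own dependencies.
student_table = bcnf_violations({"STUDENT#", "EMailID"}, [S_TO_E, E_TO_S])
result_table = bcnf_violations({"STUDENT#", "COURSE#", "Marks"}, [SC_TO_M])
```

The original table reports the two non-key determinants as violations, while the decomposed Student and Result tables report none.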

BCNF Tables
Student# Course# Marks

101 M4 82
102 M4 62
101 H6 79
103 C3 65
104 B3 77
102 P3 68
105 P3 89
103 B4 54
105 H6 87
104 M4 65

Merits of Normalization

• Normalization is based on a mathematical foundation.

• It removes redundancy to a great extent. After 3NF, data redundancy is minimized to the extent of foreign keys.

• It removes the anomalies present in INSERTs, UPDATEs and DELETEs.

Demerits of Normalization

• Data retrieval (SELECT) performance can be severely affected, because data split across several tables has to be joined back together.

• Normalization might not always represent real world scenarios.

Summary of Normal Forms

Input                 Operation                                      Output

Un-normalized table   Create separate rows or columns for every      Table in 1NF
                      combination of multivalued columns

Table in 1NF          Eliminate partial dependencies                 Tables in 2NF

Tables in 2NF         Eliminate transitive dependencies              Tables in 3NF

Tables in 3NF         Eliminate overlapping candidate key columns    Tables in BCNF

Points to Remember:

Normal Form: 1NF
Test: Relation should have atomic attributes. The domain of an attribute must include only atomic (simple, indivisible) values.
Remedy (Normalization): Form new relations for each non-atomic attribute.

Normal Form: 2NF
Test: For relations where the primary key contains multiple attributes (composite primary key), a non-key attribute should not be functionally dependent on a part of the primary key.
Remedy (Normalization): Decompose and form a new relation for each partial key with its dependent attribute(s). Retain the relation with the original primary key and any attributes that are fully functionally dependent on it.

Normal Form: 3NF
Test: Relation should not have a non-key attribute functionally determined by another non-key attribute (or by a set of non-key attributes). In other words, there should be no transitive dependency of a non-key attribute on the primary key.
Remedy (Normalization): Decompose and form a relation that includes the non-key attribute(s) that functionally determine(s) other non-key attribute(s).

Summary
• Normalization is a refinement process. It helps in removing anomalies present in INSERTs/UPDATEs/DELETEs.

• Normalization is also called a "Bottom-up approach", because it requires very minute details: every participating attribute, and how it depends on the key attributes, is crucial. If you add new attributes after normalization, the normal form itself may change.

• Four normal forms are commonly used: 1NF, 2NF, 3NF and BCNF.

• 1NF makes sure that all the attributes are atomic in nature.

• 2NF removes the partial dependency.

Summary – contd.
• 3NF removes the transitive dependency.

• BCNF removes dependency among key attributes.

• Too much normalization adversely affects SELECT or RETRIEVAL operations.

• It is usually better to normalize to 3NF for INSERT, UPDATE and DELETE intensive (on-line transaction) systems.

• It is usually better to restrict to 2NF for SELECT intensive (reporting) systems.

• While normalizing, use common sense and don't use the normal forms as absolute measures.
