Chapter 4: Functional Dependency and Normalization

Uploaded by

Berihun Tsegaye
Copyright
© © All Rights Reserved

ASSOSA UNIVERSITY

College of Computing and Informatics


DEPARTMENT OF COMPUTER SCIENCE

Fundamental Database System


Chapter 4: Functional Dependencies and Normalization



Outline
1 Informal Design Guidelines for Relational Databases
Semantics of the Relation Attributes
Redundant Information in Tuples and Update Anomalies
Null Values in Tuples
Spurious Tuples
2 Functional Dependencies (FDs)
Definition of FD
Inference Rules for FDs
Equivalence of Sets of FDs
Minimal Sets of FDs
Outline(contd.)
3 Normal Forms Based on Primary Keys
 Normalization of Relations
 Practical Use of Normal Forms
 Definitions of Keys and Attributes Participating in Keys
 First Normal Form
 Second Normal Form
 Third Normal Form
 BCNF (Boyce-Codd Normal Form)
 Fourth Normal Form
Why normalization?
• Normalization gives a formal measure of why one grouping of attributes into
relation schemas may be better than another.
• This chapter discusses the theory behind evaluating relational schemas for
design quality.
• Relation schemas can be considered at two levels:
  - The logical or conceptual view: how users interpret the relation schemas
    and the meaning of their attributes.
  - The implementation or storage view (the "base relation" level): how the
    tuples in the base relations are stored and updated.
Informal Design Guidelines for Relational
Databases

 What is relational database design?


The grouping of attributes to form "good" relation schemas
 Two levels of relation schemas
 The logical "user view" level
 The storage "base relation" level
 Design is concerned mainly with base relations
 What are the criteria for "good" base relations?
Semantics of the Relation Attributes
GUIDELINE 1: Informally, each tuple in a relation should represent one
entity or relationship instance. (Applies to individual relations and their
attributes).
 Attributes of different entities (EMPLOYEEs, DEPARTMENTs,
PROJECTs) should not be mixed in the same relation
 Only foreign keys should be used to refer to other entities
 Entity and relationship attributes should be kept apart as much as
possible.

Bottom Line: Design a schema that can be explained easily relation by
relation. The semantics of attributes should be easy to interpret.
A COMPANY relational database schema
A sample relational database state corresponding to
COMPANY database
A simplified COMPANY relational database schema
Redundant Information in Tuples and
Update Anomalies

 Mixing attributes of multiple entities may cause problems


 Information is stored redundantly wasting storage
 Problems with update anomalies
 Insertion anomalies
 Deletion anomalies
 Modification anomalies
EXAMPLE OF AN UPDATE ANOMALY
Consider the relation:
• EMP_PROJ (Emp#, Proj#, Ename, Pname, No_hours)
• Update Anomaly: Changing the name of project number P1 from “Billing” to
“Customer-Accounting” forces the same update to be made in every tuple for
P1, for example for all 100 employees working on project P1.
• Insert Anomaly: A project cannot be inserted unless an employee is
assigned to it. Conversely, an employee cannot be inserted unless he/she is
assigned to a project.
• Delete Anomaly: When a project is deleted, the information about all the
employees who work only on that project is deleted with it. Alternately, if
an employee is the sole employee on a project, deleting that employee also
deletes the corresponding project.
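The anomalies above can be reproduced on a tiny in-memory version of EMP_PROJ. This is only a sketch: the rows, the employee names, and the helper names `rename_project` and `project_names` are made up for illustration and do not come from the slides.

```python
# Hypothetical EMP_PROJ rows: (emp_no, proj_no, ename, pname, hours)
emp_proj = [
    ("E1", "P1", "Smith", "Billing", 10),
    ("E2", "P1", "Wong", "Billing", 20),
    ("E3", "P2", "Zelaya", "Payroll", 15),
]

def rename_project(rows, proj_no, new_name):
    """Rename a project; every tuple that mentions it has to be rewritten."""
    updated = [(e, p, en, new_name if p == proj_no else pn, h)
               for (e, p, en, pn, h) in rows]
    touched = sum(1 for r in rows if r[1] == proj_no)
    return updated, touched

def project_names(rows):
    """Project number -> name, as far as EMP_PROJ alone can tell us."""
    return {p: pn for (_, p, _, pn, _) in rows}

# Update anomaly: one logical change touches one tuple per assigned employee.
renamed, touched = rename_project(emp_proj, "P1", "Customer-Accounting")
print(touched)

# Delete anomaly: removing the only employee on P2 also loses project P2.
after_delete = [r for r in emp_proj if r[0] != "E3"]
print(project_names(after_delete))
```

With separate EMPLOYEE, PROJECT, and WORKS_ON relations the rename would touch a single PROJECT tuple, and deleting an employee would leave the project in place.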
Two relation schemas suffering from update anomalies
Example States for EMP_DEPT and EMP_PROJ
Guideline to Redundant Information in Tuples
and Update Anomalies
• GUIDELINE 2: Design a schema that does not suffer from insertion,
deletion, and update anomalies.
• If any anomalies are present, note them clearly so that applications can
be made to take them into account.
Problems with Null Values
• If many attributes are grouped together into one "fat" relation, many of
its tuples will contain nulls.
  - Nulls waste storage.
  - Nulls make the meaning of the attributes harder to understand.
  - Nulls complicate aggregate operators such as COUNT or SUM.
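The aggregate problem can be seen with Python's built-in `sqlite3` module. The table `emp` and its `bonus` column are hypothetical; the point is only that SQL aggregates skip NULLs, so COUNT(*) and COUNT(bonus) disagree and AVG is taken over fewer rows than the table holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, bonus REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("A", 100.0), ("B", None), ("C", 300.0)],  # B has no bonus recorded
)

# COUNT(*) counts rows; COUNT(bonus), SUM and AVG ignore the NULL.
count_all, count_bonus, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(bonus), SUM(bonus), AVG(bonus) FROM emp"
).fetchone()
print(count_all, count_bonus, total, average)
conn.close()
```

Whether the missing bonus means "zero", "unknown", or "not applicable" cannot be read off the result, which is the interpretation problem the slide points at.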
Null Values in Tuples
GUIDELINE 3: Relations should be designed such that their tuples
will have as few NULL values as possible
 Attributes that are NULL frequently could be placed in separate
relations (with the primary key)
 Reasons for nulls:
 attribute not applicable or invalid
 attribute value unknown (may exist)
 value known to exist, but unavailable
Spurious Tuples
 Bad designs for a relational database may result in erroneous
results for certain JOIN operations
 The "lossless join" property is used to guarantee meaningful
results for join operations

GUIDELINE 4: The relations should be designed to satisfy the lossless join
condition. No spurious tuples should be generated by doing a natural join of
any relations.
Example of Spurious Tuples
Example of Spurious Tuples generation
Generation of spurious tuples
• Using the two relations EMP_PROJ1 and EMP_LOCS as the base relations in
place of EMP_PROJ is not a good schema design.
• If a natural join is performed on these two relations, it produces more
tuples than the original set of tuples in EMP_PROJ.
• The additional tuples that were not in EMP_PROJ are called spurious tuples
because they represent wrong information that is not valid.
• This happens because PLOCATION, the attribute used for joining, is neither
a primary key nor a foreign key in either EMP_LOCS or EMP_PROJ1.
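A small sketch makes the effect concrete. The three EMP_PROJ rows are invented for illustration; the attribute order (ssn, pnumber, hours, ename, pname, plocation) is an assumption about the schema shown in the figure.

```python
# Hypothetical EMP_PROJ state: (ssn, pnumber, hours, ename, pname, plocation)
emp_proj = {
    ("111", 1, 10, "Smith", "ProductX", "Bellaire"),
    ("222", 2, 20, "Wong", "ProductY", "Sugarland"),
    ("333", 3, 5, "Zelaya", "ProductZ", "Bellaire"),
}

# Project onto the two base relations of the bad design.
emp_locs = {(ename, ploc) for (_, _, _, ename, _, ploc) in emp_proj}
emp_proj1 = {(ssn, pno, hrs, pname, ploc)
             for (ssn, pno, hrs, _, pname, ploc) in emp_proj}

# Natural join on PLOCATION, the only common attribute.
joined = {(ssn, pno, hrs, ename, pname, ploc)
          for (ssn, pno, hrs, pname, ploc) in emp_proj1
          for (ename, ploc2) in emp_locs
          if ploc == ploc2}

spurious = joined - emp_proj
print(len(emp_proj), len(joined), len(spurious))
```

Every original tuple is still in the join, but two extra tuples pair Smith with ProductZ and Zelaya with ProductX, simply because both projects are located in Bellaire.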
Spurious Tuples (2)
There are two important properties of decompositions:

I. the non-additive (lossless) property of the corresponding join

II. preservation of the functional dependencies.

Note that property (I) is extremely important and cannot be sacrificed.
Property (II) is less stringent and may be sacrificed.
Summary and Discussion of Design Guidelines
 Anomalies cause redundant work to be done during
 Insertion
 Modification
 Deletion
 Waste of storage space due to nulls and difficulty of
performing aggregation operations and joins due to null values
 Generation of invalid and spurious data during joins on
improperly related base relations.
Functional Dependencies
 Functional dependencies (FDs)
 Is a constraint between two sets of attributes from the database.
 Assumption
 The entire database is a single universal relation schema
R={A1,A2…An}
 Where A1,A2 … are the attributes.
Functional Dependencies
 Functional dependencies (FDs) are used to specify formal
measures of the "goodness" of relational designs.
 FDs and keys are used to define normal forms for relations.
 FDs are constraints that are derived from the meaning and
interrelationships of the data attributes.
 A set of attributes X functionally determines a set of attributes Y
if the value of X determines a unique value for Y.
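Stated formally, X -> Y holds if any two tuples t1 and t2 with t1[X] = t2[X] also have t1[Y] = t2[Y]. The sketch below checks that condition on one relation state; the function name `fd_holds` and the sample rows are illustrative assumptions. A single state can only refute an FD, since an FD is a property of the schema, not of one instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for this X-value and compare.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"SSN": "1", "ENAME": "Smith", "PNUMBER": 1, "HOURS": 10},
    {"SSN": "1", "ENAME": "Smith", "PNUMBER": 2, "HOURS": 5},
    {"SSN": "2", "ENAME": "Wong", "PNUMBER": 1, "HOURS": 20},
]
print(fd_holds(rows, ["SSN"], ["ENAME"]))             # SSN -> ENAME
print(fd_holds(rows, ["SSN"], ["HOURS"]))             # SSN -> HOURS
print(fd_holds(rows, ["SSN", "PNUMBER"], ["HOURS"]))  # {SSN, PNUMBER} -> HOURS
```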
Functional Dependencies
 If a relation schema has more than one key, each is called a
candidate key.
 One of the candidate keys is arbitrarily designated to be the
primary key, and the others are called secondary keys.
 A Prime attribute must be a member of some candidate key
 A Nonprime attribute is not a prime attribute that is, it is not a
member of any candidate key.
Types of Functional Dependencies
There are four major types of FDs.
1. Partial Dependency and Fully Functional Dependency
• Partial dependency: Suppose the primary key has more than one attribute
and A is a non-prime attribute. If A depends on only some of the key
attributes rather than all of them, a partial dependency exists.
• Fully functional dependency: If A is a non-prime attribute whose value
depends on all of the key attributes together, A is said to be fully
functionally dependent on the key.
2. Transitive Dependency and Non-transitive Dependency
• Transitive dependency: A transitive dependency arises from a dependency
between non-prime attributes. Suppose in a relation R, X → Y (Y depends upon
X) and Y → Z (Z depends upon Y); then X → Z (Z depends upon X). Z is then
said to be transitively dependent upon X.
• Non-transitive dependency: Any functional dependency that is not
transitive is known as a non-transitive dependency.
3. Single-Valued and Multivalued Dependency
• Single-valued dependency: In a relation R, if for a particular value of X,
Y has a single value, it is known as a single-valued dependency.
• Multivalued dependency (MVD): In a relation R, if for a particular value
of X, Y has more than one value, it is known as a multivalued dependency.
4. Trivial Dependency and Non-trivial Dependency
• Trivial FD: In a relation R, X → Y is trivial if Y ⊆ X (Y is a subset of
X).
• Non-trivial FD: In a relation R, X → Y is non-trivial if Y ⊄ X (Y is not
a subset of X).

06/22/2024 30
Examples of FD constraints (1)
• Social security number determines employee name
  - SSN -> ENAME
• Project number determines project name and location
  - PNUMBER -> {PNAME, PLOCATION}
• Employee SSN and project number together determine the hours per week
that the employee works on the project
  - {SSN, PNUMBER} -> HOURS
Graphical representation of Functional Dependencies
Examples of FD constraints (2)
 An FD is a property of the attributes in the schema R, not of a
particular legal relation state r of R.
 It must be defined explicitly by someone who knows the
semantics of the attributes of R.
 The constraint must hold on every relation instance r(R)
 If K is a key of R, then K functionally determines all
attributes in R.
Redundant functional dependencies
• Given a set F of FDs, an FD A -> B in F is said to be redundant with
respect to the FDs of F iff A -> B can be derived from the set of FDs
F - {A -> B}.
• Redundant FDs are extra and unnecessary and can be safely removed from the
set F.
• Eliminating redundant FDs allows us to minimize the set of FDs.
• The Membership Algorithm helps us determine redundant FDs.
Membership Algorithm
Assume F is a set of functional dependencies with A -> B ∈ F. To determine
whether A -> B is redundant with respect to the other FDs of the set F:

1. Remove A -> B: initialize G = F - {A -> B}. If G ≠ ∅, proceed to step 2;
otherwise stop, since A -> B is nonredundant.

2. Apply the inference rules to check whether A -> B can be deduced from G.
If it can, A -> B is redundant.
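The two steps can be sketched in Python. Step 2 is implemented here with an attribute-closure test (B is derivable from A under G exactly when B is contained in the closure of A under G), which is one common way to "apply the inference rules"; the slides do not prescribe it. The names `closure` and `is_redundant`, and the sample set F, are illustrative assumptions. Attributes are single letters, so a string such as "AB" stands for the set {A, B}.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_redundant(fd, fds):
    """Membership test: does fd follow from the other FDs in fds?"""
    rest = [g for g in fds if g != fd]   # step 1: G = F - {A -> B}
    if not rest:                         # G is empty: nonredundant
        return False
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, rest)  # step 2: is B inside A+ under G?

F = [("A", "B"), ("B", "C"), ("A", "C")]
print([fd for fd in F if is_redundant(fd, F)])
```

Here A -> C is reported as redundant because it follows from A -> B and B -> C by transitivity.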
Another Example
F = {SSN -> {ENAME, BDATE, ADDRESS, DNUMBER}, DNUMBER -> {DNAME, DMGRSSN}}

Some of the functional dependencies that can be inferred from F are:

SSN -> {DNAME, DMGRSSN}

SSN -> SSN

DNUMBER -> DNAME
• To infer dependencies systematically, we need a set of inference rules
that can be used to derive new dependencies from a given set of
dependencies.
Inference Rules for Functional Dependencies
• F is the set of functional dependencies that are specified on relation
schema R.
• Schema designers specify the most obvious FDs.
• The other dependencies can be inferred or deduced from the FDs in F.
Inference Rules for FDs (1)
• Given a set of FDs F, we can infer additional FDs that hold whenever the
FDs in F hold.

Armstrong's inference rules:
• IR1. (Reflexive) If Y subset-of X, then X -> Y
• IR2. (Augmentation) If X -> Y, then XZ -> YZ
  (Notation: XZ stands for X U Z)
• IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z
• IR1, IR2, IR3 form a sound and complete set of inference rules.
• That is, these rules hold, and all other rules that hold can be deduced
from them.
Inference Rules for FDs (2)
Some additional inference rules that are useful:
• Projectivity or Decomposition: If X -> YZ, then X -> Y and X -> Z
• Additivity or Union: If X -> Y and X -> Z, then X -> YZ
• Pseudotransitivity: If X -> Y and WY -> Z, then WX -> Z
• The last three inference rules, as well as any other inference rules, can
be deduced from IR1, IR2, and IR3 (completeness property).
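As an illustration of the completeness claim, the three additional rules can be derived step by step from IR1, IR2, and IR3. The derivations below are the standard ones, written in LaTeX with the same symbols as the rules above.

```latex
\begin{align*}
\textbf{Decomposition: } & \text{given } X \to YZ \\
& YZ \to Y   && \text{(IR1, since } Y \subseteq YZ\text{)} \\
& X \to Y    && \text{(IR3 on } X \to YZ,\ YZ \to Y\text{)} \\[6pt]
\textbf{Union: } & \text{given } X \to Y \text{ and } X \to Z \\
& X \to XY   && \text{(IR2: augment } X \to Y \text{ with } X\text{)} \\
& XY \to YZ  && \text{(IR2: augment } X \to Z \text{ with } Y\text{)} \\
& X \to YZ   && \text{(IR3 on } X \to XY,\ XY \to YZ\text{)} \\[6pt]
\textbf{Pseudotransitivity: } & \text{given } X \to Y \text{ and } WY \to Z \\
& WX \to WY  && \text{(IR2: augment } X \to Y \text{ with } W\text{)} \\
& WX \to Z   && \text{(IR3 on } WX \to WY,\ WY \to Z\text{)}
\end{align*}
```

The derivation of X -> Z in the decomposition rule is symmetric, using YZ -> Z.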
Closures, cover and Equivalence of FDs
 Given a set F of FDs, we can determine all the FDs that can be
logically implied by F.
 The most important application of this logic is in the
normalization process of relations.
 Closures
 Covers
 Equivalence of FDs
Example of Closure
• A department has one manager (DEPT_NO -> MGR_SSN).
• A manager has a unique phone number (MGR_SSN -> MGR_PHONE).
• Together, these two dependencies imply DEPT_NO -> MGR_PHONE.
• This leads to the concept of closure, which includes all possible
dependencies that can be inferred from the given set F.
• The set of all dependencies that includes F as well as all dependencies
that can be inferred from F is called the closure of F (F+).
Closure of a FD
• The closure of a set F of FDs is the set F+ of all FDs that can be
inferred from F.
• The closure of a set of attributes X with respect to F is the set X+ of
all attributes that are functionally determined by X.
• X+ can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in
F.
Algorithm: Determining X+, the closure of X under F
X+ := X;
repeat
    old X+ := X+;
    for each functional dependency Y -> Z in F do
        if Y is a subset of X+ then X+ := X+ U Z;
until (X+ = old X+)
Example: Computing closure
F := { ssn -> ename,
       pnumber -> {pname, plocation},
       {ssn, pnumber} -> hours }
Applying the algorithm:
{ssn}+ = {ssn, ename}
{pnumber}+ = {pnumber, pname, plocation}
{ssn, pnumber}+ = {ssn, pnumber, ename, pname, plocation, hours}
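The closure algorithm translates almost line for line into Python. In this sketch each FD is a pair of attribute sets; that representation and the function name `closure` are choices made here, not part of the slides.

```python
def closure(attrs, fds):
    """Closure X+ of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)                  # X+ := X
    while True:
        old = set(result)                # old X+ := X+
        for lhs, rhs in fds:
            if lhs <= result:            # Y is a subset of X+
                result |= rhs            # X+ := X+ U Z
        if result == old:                # until X+ = old X+
            return result

F = [
    ({"ssn"}, {"ename"}),
    ({"pnumber"}, {"pname", "plocation"}),
    ({"ssn", "pnumber"}, {"hours"}),
]
print(sorted(closure({"ssn"}, F)))
print(sorted(closure({"pnumber"}, F)))
print(sorted(closure({"ssn", "pnumber"}, F)))
```

The three printed sets match the closures worked out by hand in the example.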
Equivalence of Sets of FDs
• Two sets of FDs F and G are equivalent if:
  - every FD in F can be inferred from G, and
  - every FD in G can be inferred from F
• Hence, F and G are equivalent if F+ = G+

Definition: F covers G if every FD in G can be inferred from F (i.e., if G+
subset-of F+)
• F and G are equivalent if F covers G and G covers F
• There is an algorithm for checking equivalence of sets of FDs
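One way to check equivalence without enumerating F+ and G+ is to test coverage with attribute closures: F covers G if, for every X -> Y in G, Y is contained in the closure of X under F. The sketch below follows that idea; the function names and the sample sets F, G, and H are illustrative assumptions.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def covers(f, g):
    """F covers G if every FD X -> Y in G can be inferred from F."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """F and G are equivalent if each covers the other."""
    return covers(f, g) and covers(g, f)

F = [({"A"}, {"B"}), ({"B"}, {"C"})]
G = [({"A"}, {"B", "C"}), ({"B"}, {"C"})]
H = [({"A"}, {"B"})]
print(equivalent(F, G), equivalent(F, H))
```

F and G are equivalent, while H is only covered by F: H cannot infer B -> C.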
Non-Redundant Cover Algorithm
• Initialize G to F, that is, set G = F.
• Test every FD of G for redundancy using the Membership Algorithm, removing
each redundant FD from G, until there are no more FDs of G to be tested.
• The resulting set G is a nonredundant cover of F.
Extraneous Attributes
• The size of the FDs of F can be reduced further by removing either
extraneous left attributes or extraneous right attributes with respect to F.
• Let F be a set of FDs over schema R, and let A1A2 -> B1B2 be an FD in F.
• A1 is extraneous iff
  F ≡ (F - {A1A2 -> B1B2}) U {A2 -> B1B2}
Canonical Cover
 For a given set F of FDs, a canonical cover, denoted by
Fc, is a set of FDs where the following conditions are
simultaneously satisfied:
I. Every FD of Fc is simple. That is RHS of every FDs of
Fc has only one attribute

II. Fc is left-reduced.

III. Fc is nonredundant.
Minimal Sets of FDs
A set of FDs F is minimal if it satisfies the following conditions:

I. Every dependency in F has a single attribute for its RHS.

II. We cannot remove any dependency from F and have a set of dependencies
that is equivalent to F.

III. We cannot replace any dependency X -> A in F with a dependency Y -> A,
where Y is a proper subset of X, and still have a set of dependencies that
is equivalent to F.
Algorithm: Finding a minimal cover F for a set of
functional dependencies E
1. Set F := E
2. Replace each FD X -> {A1, A2, ..., An} in F by the n FDs
   X -> A1, X -> A2, ..., X -> An
3. For each FD X -> A in F
       for each attribute B that is an element of X
           if {F - {X -> A}} U {(X - {B}) -> A} is equivalent to F
           then replace X -> A with (X - {B}) -> A in F
4. For each remaining FD X -> A in F
       if {F - {X -> A}} is equivalent to F,
       then remove X -> A from F
Example: Finding minimal cover of E
E: {B -> A, D -> A, AB -> D}
• Check whether AB -> D can be replaced with A -> D or B -> D.

Given B -> A

Augmenting with B on both sides => BB -> AB => B -> AB

Now B -> AB and given AB -> D

Hence B -> D (by transitivity), so AB -> D can be replaced with B -> D.
• Now E' = {B -> A, D -> A, B -> D}

B -> D and D -> A => B -> A, so B -> A is redundant and is removed.

Minimum cover of E = {B -> D, D -> A}
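The minimal-cover procedure can be sketched in Python and run on E. The equivalence tests in steps 3 and 4 are carried out here with attribute closures, which is an implementation choice; the function names are illustrative. After step 2 each FD is stored as a pair (left-hand attribute set, single right-hand attribute).

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (frozenset lhs, single attr)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, a in fds:
            if lhs <= result and a not in result:
                result.add(a)
                changed = True
    return result

def minimal_cover(fds):
    # Step 2: one attribute on every right-hand side (duplicates dropped).
    f = list(dict.fromkeys((frozenset(l), a) for l, r in fds for a in r))
    # Step 3: remove extraneous attributes from left-hand sides.
    for i, (lhs, a) in enumerate(f):
        for b in sorted(lhs):
            reduced = f[i][0] - {b}
            if reduced and a in closure(reduced, f):
                f[i] = (reduced, a)
    # Step 4: remove FDs that follow from the remaining ones.
    for fd in list(f):
        rest = [g for g in f if g != fd]
        if fd[1] in closure(fd[0], rest):
            f = rest
    return f

E = [({"B"}, {"A"}), ({"D"}, {"A"}), ({"A", "B"}, {"D"})]
for lhs, a in minimal_cover(E):
    print(sorted(lhs), "->", a)
```

The result contains D -> A and B -> D, in agreement with the hand derivation. A different processing order can yield a different but equivalent minimal cover for other inputs.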
Minimal Sets of FDs (2)
 Every set of FDs has an equivalent minimal set
 There can be several equivalent minimal sets
 There is no simple algorithm for computing a minimal set of
FDs that is equivalent to a set F of FDs
 To synthesize a set of relations, we assume that we start with a
set of dependencies that is a minimal set
Normalization
What is Normalization? It is a database design technique that reduces data
redundancy and eliminates undesirable characteristics such as insertion,
update, and deletion anomalies.
Normalization is a process for evaluating and correcting table structures to
minimize data redundancies, thereby helping to eliminate data anomalies.
• It is a formal technique for analyzing a relation based on its primary key
and the functional dependencies between the attributes of that relation.
• It is often executed as a series of steps. Each step corresponds to a
specific normal form, which has known properties.
• As normalization proceeds, the relations become progressively more
restricted (stronger) in format and also less vulnerable to update
anomalies.
• It helps us evaluate table structures and produce good tables.
Normalization
Why do we need to normalize?
• To avoid redundancy (less storage space is needed, and data stays
consistent)
• To avoid update/delete anomalies (data consistency within the database)
• The four most commonly used normal forms are first (1NF), second (2NF),
third (3NF) and Boyce-Codd (BCNF) normal forms.
Unnormalized Form (UNF)
• A table that contains one or more repeating groups.
• To create an unnormalized table, transform the data from the information
source into table format with columns and rows.
Normalization
• This is the process that allows you to winnow out redundant data within
your database. (Note: Winnow (Webster): to get rid of or eliminate inferior
material.)
• The results of a well-executed normalization process are the same as those
of a well-planned E-R model.
• It involves restructuring the tables to successively meet higher normal
forms.
• A properly normalized database should have the following characteristics:
  - Scalar values in each field
  - Absence of redundancy
  - Minimal use of null values
  - Minimal loss of information
Normalization: Process
• Eliminate Repeating Groups
  - Make a separate table for each set of related attributes and give each
    table a primary key.
• Eliminate Redundant Data
  - If an attribute depends on only part of a composite (multi-attribute)
    key, remove it to a separate table.
• Eliminate Columns Not Dependent on Key
  - If attributes do not contribute to a description of the key, remove them
    to a separate table.
• Isolate Independent Multiple Relationships
  - No table may contain two or more 1:n or n:m relationships that are not
    directly related.
• Isolate Semantically Related Multiple Relationships
  - There may be practical constraints on information that justify
    separating logically related many-to-many relationships.
Normalization: Levels
• Levels of normalization are based on the amount of redundancy in the
database.
• Relational theory defines a number of structural conditions called normal
forms that ensure that certain data anomalies do not occur in a database.
• The levels of normalization are:
  - First Normal Form (1NF)
  - Second Normal Form (2NF)
  - Third Normal Form (3NF)
  - Boyce-Codd Normal Form (BCNF)
  - Fourth Normal Form (4NF)
  - Fifth Normal Form (5NF)
[Figure: the levels drawn against scales labeled Redundancy, Number of
Tables, and Complexity]

Most databases should be in 3NF or BCNF in order to avoid the database
anomalies.
Normalization: Levels
Level  Requirement
1NF    Keys; no repeating groups or multi-valued attributes
2NF    No partial dependencies
3NF    No transitive dependencies
BCNF   Determinants are candidate keys
4NF    No multivalued dependencies
5NF    No join dependencies beyond those implied by the candidate keys

Each higher level is a subset of the lower level.
Normalization: First Normal Form (1NF)
A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to lists of values).
Example (Not 1NF)

ISBN           Title         AuName                  AuPhone                                   PubName      PubPhone      Price
0-321-32132-1  Balloon       Sleepy, Snoopy, Grumpy  321-321-1111, 232-234-1234, 665-235-6532  Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Jones, Smith            123-333-3333, 654-223-3455                Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Joyce                   666-666-6666                              Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Roman                   444-444-4444                              Big House    123-456-7890  $25.00

The AuName and AuPhone columns are not scalar.
Normalization
1NF: Decomposition
1. Place all items appearing in the repeating group in a new table.
2. Designate a primary key for each new table produced.
3. Create a relationship between the two tables:
• For a 1:N relationship, duplicate the P.K. from the 1 side on the many side
• For an M:N relationship, create a new table with the P.K.s from both tables
Example (1NF)

ISBN           Title         PubName      PubPhone      Price
0-321-32132-1  Balloon       Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Big House    123-456-7890  $25.00

ISBN           AuName  AuPhone
0-321-32132-1  Sleepy  321-321-1111
0-321-32132-1  Snoopy  232-234-1234
0-321-32132-1  Grumpy  665-235-6532
0-55-123456-9  Jones   123-333-3333
0-55-123456-9  Smith   654-223-3455
0-123-45678-0  Joyce   666-666-6666
1-22-233700-0  Roman   444-444-4444
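The 1NF decomposition steps can be sketched as a small Python function that moves a repeating group into its own table. The function name `to_1nf` and the dictionary representation of rows are assumptions for illustration; only two of the book rows are used to keep the sketch short.

```python
# Unnormalized rows: AuName and AuPhone hold lists (the repeating group).
books = [
    {"ISBN": "0-321-32132-1", "Title": "Balloon",
     "AuName": ["Sleepy", "Snoopy", "Grumpy"],
     "AuPhone": ["321-321-1111", "232-234-1234", "665-235-6532"]},
    {"ISBN": "0-123-45678-0", "Title": "Ulysses",
     "AuName": ["Joyce"], "AuPhone": ["666-666-6666"]},
]

def to_1nf(rows, key, group):
    """Split the repeating-group columns into a new table keyed by `key`."""
    base = [{k: v for k, v in r.items() if k not in group} for r in rows]
    child = [dict({key: r[key]}, **dict(zip(group, values)))
             for r in rows
             for values in zip(*(r[g] for g in group))]
    return base, child

book_table, author_table = to_1nf(books, "ISBN", ["AuName", "AuPhone"])
for row in author_table:
    print(row)
```

Each author row now carries a copy of the ISBN, which acts as the foreign key back to the book table, and every field holds a single scalar value.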
Normalization: Functional Dependencies
1. If one set of attributes in a table determines another set of
attributes in the table, then the second set of attributes is
said to be functionally dependent on the first set of
attributes.

Example 1
Table Scheme: {ISBN, Title, Price}
Functional Dependencies: {ISBN} -> {Title}
                         {ISBN} -> {Price}

ISBN           Title         Price
0-321-32132-1  Balloon       $34.00
0-55-123456-9  Main Street   $22.95
0-123-45678-0  Ulysses       $34.00
1-22-233700-0  Visual Basic  $25.00
Normalization: Functional Dependencies
Example 2
Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies: {PubID} -> {PubPhone}
                         {PubID} -> {PubName}
                         {PubName, PubPhone} -> {PubID}

PubID  PubName      PubPhone
1      Big House    999-999-9999
2      Small House  123-456-7890
3      Alpha Press  111-111-1111

Example 3
Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies: {AuID} -> {AuPhone}
                         {AuID} -> {AuName}
                         {AuName, AuPhone} -> {AuID}

AuID  AuName  AuPhone
1     Sleepy  321-321-1111
2     Snoopy  232-234-1234
3     Grumpy  665-235-6532
4     Jones   123-333-3333
5     Smith   654-223-3455
6     Joyce   666-666-6666
7     Roman   444-444-4444
Normalization: Dependency Diagram
• The primary key components are bold, underlined, and shaded in a
different color.
• The arrows above the attributes indicate all desirable dependencies, i.e.,
dependencies that are based on the PK.
• The arrows below the dependency diagram indicate less desirable
dependencies: partial dependencies and transitive dependencies.
Normalization: Functional Dependencies(Example)
I. Database to track reviews of papers submitted to an academic
conference. Prospective authors submit papers for review and
possible acceptance in the published conference proceedings.
Details of the entities:
 Author information includes a unique author number, a name, a mailing
address, and a unique (optional) email address.
 Paper information includes the primary author, the paper number, the title,
the abstract, and review status (pending, accepted, rejected)
 Reviewer information includes the reviewer number, the name, the mailing
address, and a unique (optional) email address
 A completed review includes the reviewer number, the date, the paper
number, comments to the authors, comments to the program chairperson,
and ratings (overall, originality, correctness, style, clarity)

Normalization: Functional Dependencies (Example)
Functional Dependencies
• AuthNo -> AuthName, AuthEmail, AuthAddress
• AuthEmail -> AuthNo
• PaperNo -> Primary-AuthNo, Title, Abstract, Status
• RevNo -> RevName, RevEmail, RevAddress
• RevEmail -> RevNo
• RevNo, PaperNo -> AuthComm, Prog-Comm, Date, Rating1, Rating2, Rating3,
  Rating4, Rating5
Normalization: Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements:
• The table is in first normal form
• All non-key attributes in the table must be functionally dependent on the
entire primary key
Note: Remember that we are dealing with non-key attributes

Example 1 (Not 2NF)
Scheme: {StudentId, CourseId, StudentName, CourseTitle, Grade}
1. Key: {StudentId, CourseId}
2. {StudentId} -> {StudentName}
3. {CourseId} -> {CourseTitle}
4. {StudentId, CourseId} -> {Grade}
5. StudentName depends on a subset of the key, i.e. StudentId
6. CourseTitle depends on a subset of the key, i.e. CourseId
Normalization: Second Normal Form (2NF)
Example 2 (Not 2NF)
Scheme: {City, Street, HouseNumber, HouseColor, CityPopulation}
1. Key: {City, Street, HouseNumber}
2. {City, Street, HouseNumber} -> {HouseColor}
3. {City} -> {CityPopulation}
4. CityPopulation does not belong to any key.
5. CityPopulation is functionally dependent on City, which is a proper
subset of the key

Example 3 (Not 2NF)
Scheme: {studio, movie, budget, studio_city}
1. Key: {studio, movie}
2. {studio, movie} -> {budget}
3. {studio} -> {studio_city}
4. studio_city is not part of a key
5. studio_city functionally depends on studio, which is a proper subset of
the key
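The check applied in these examples can be sketched as follows. The function name `partial_dependencies` is hypothetical, and the sketch makes two simplifying assumptions: the table has a single candidate key, and only the FDs that are listed explicitly are inspected (implied FDs are not derived).

```python
def partial_dependencies(key, fds):
    """FDs whose left side is a proper subset of the key and whose right
    side contains at least one non-key attribute."""
    key = set(key)
    return [(lhs, rhs - key) for lhs, rhs in fds
            if lhs < key and rhs - key]

house_key = {"City", "Street", "HouseNumber"}
house_fds = [
    ({"City", "Street", "HouseNumber"}, {"HouseColor"}),
    ({"City"}, {"CityPopulation"}),
]
print(partial_dependencies(house_key, house_fds))

student_key = {"StudentId", "CourseId"}
student_fds = [
    ({"StudentId"}, {"StudentName"}),
    ({"CourseId"}, {"CourseTitle"}),
    ({"StudentId", "CourseId"}, {"Grade"}),
]
print(len(partial_dependencies(student_key, student_fds)))
```

An empty result means that, under the stated assumptions, every listed non-key attribute depends on the whole key.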
Normalization: 2NF (Decomposition)
1. If a data item is fully functionally dependent on only a part of the
primary key, move that data item and that part of the primary key
to a new table.
2. If other data items are functionally dependent on the same part of
the key, place them in the new table also.
3. Make the partial primary key copied from the original table the
primary key for the new table.
Example 1 (Convert to 2NF)
Old Scheme: {StudentId, CourseId, StudentName, CourseTitle, Grade}
New Scheme: {StudentId, StudentName}
New Scheme: {CourseId, CourseTitle}
New Scheme: {StudentId, CourseId, Grade}
Normalization: 2NF (Decomposition)
Example 2 (Convert to 2NF)
Old Scheme: {StudioID, Movie, Budget, StudioCity}
New Scheme: {Movie, StudioID, Budget}
New Scheme: {StudioID, StudioCity}

Example 3 (Convert to 2NF)
Old Scheme: {City, Street, HouseNumber, HouseColor, CityPopulation}
New Scheme: {City, Street, HouseNumber, HouseColor}
New Scheme: {City, CityPopulation}
Normalization: Third Normal Form (3NF)
• This form dictates that all non-key attributes of a table must be
functionally dependent on a candidate key such that there are no
interdependencies among non-key attributes, i.e., there should be no
transitive dependencies.
• For a table to be in 3NF, there are two requirements:
  - The table should be in second normal form
  - No non-key attribute is transitively dependent on the primary key

Example (Not in 3NF)
Scheme: {Title, PubID, BookType, Price}
1. Key: {Title, PubID}
2. {Title, PubID} -> {BookType}
3. {BookType} -> {Price}
4. Both Price and BookType depend on the key, hence 2NF
5. Transitively {Title, PubID} -> {Price}, hence not in 3NF

Title           PubID  BookType   Price
Moby Dick       1      Adventure  34.95
Giant           2      Adventure  34.95
Moby Dick       2      Adventure  34.95
Iliad           1      War        44.95
Romeo & Juliet  1      Love       59.90
Normalization: Third Normal Form (3NF)
Example 2 (Not in 3NF)
Scheme: {StudioID, StudioCity, CityTemp}
1. Primary Key: {StudioID}
2. {StudioID} -> {StudioCity}
3. {StudioCity} -> {CityTemp}
4. {StudioID} -> {CityTemp}
5. Both StudioCity and CityTemp depend on the entire key, hence 2NF
6. CityTemp transitively depends on StudioID, hence violates 3NF

Example 3 (Not in 3NF)
Scheme: {BuildingID, Contractor, Fee}
1. Primary Key: {BuildingID}
2. {BuildingID} -> {Contractor}
3. {Contractor} -> {Fee}
4. {BuildingID} -> {Fee}
5. Fee transitively depends on BuildingID
6. Both Contractor and Fee depend on the entire key, hence 2NF

BuildingID  Contractor  Fee
100         Randolph    1200
150         Ingersoll   1100
200         Randolph    1200
250         Pitkin      1100
300         Randolph    1200
Normalization: 3NF (Decomposition)
1. Move all items involved in transitive dependencies to a new entity.
2. Identify a primary key for the new entity.
3. Place the primary key for the new entity as a foreign key on the
original entity.
Example 1 (Convert to 3NF)
Old Scheme: {Title, PubID, BookType, Price}
New Scheme: {BookType, Price}
New Scheme: {Title, PubID, BookType}
Normalization: 3NF (Decomposition)
Example 2 (Convert to 3NF)
Old Scheme: {StudioID, StudioCity, CityTemp}
New Scheme: {StudioID, StudioCity}
New Scheme: {StudioCity, CityTemp}

Example 3 (Convert to 3NF)
Old Scheme: {BuildingID, Contractor, Fee}
New Scheme: {BuildingID, Contractor}
New Scheme: {Contractor, Fee}

BuildingID  Contractor
100         Randolph
150         Ingersoll
200         Randolph
250         Pitkin
300         Randolph

Contractor  Fee
Randolph    1200
Ingersoll   1100
Pitkin      1100
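The Building example can be replayed in a few lines of Python to show that the 3NF decomposition stores each fee once and that joining the two new tables on Contractor gives back the original rows. The variable names are illustrative; the data is the table from the slide.

```python
building = [
    (100, "Randolph", 1200),
    (150, "Ingersoll", 1100),
    (200, "Randolph", 1200),
    (250, "Pitkin", 1100),
    (300, "Randolph", 1200),
]

# New Scheme {BuildingID, Contractor}
assignment = sorted({(b, c) for b, c, _ in building})
# New Scheme {Contractor, Fee}: one entry per contractor
fee = dict({(c, f) for _, c, f in building})

# Join the two tables on Contractor.
rejoined = [(b, c, fee[c]) for b, c in assignment]
print(len(fee), rejoined == sorted(building))
```

Randolph's fee now appears once instead of three times, so changing it is a single update.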
Normalization: Boyce-Codd Normal Form (BCNF)
• BCNF does not allow dependencies between attributes that belong to
candidate keys.
• BCNF is a refinement of the third normal form that drops the restriction
to non-key attributes: every determinant must be a candidate key.
• Third normal form and BCNF differ only if the following conditions hold:
  - The table has two or more candidate keys
  - At least two of the candidate keys are composed of more than one attribute
  - The keys are not disjoint, i.e. the composite candidate keys share some
    attributes
Example 1 - Address (Not in BCNF)
Scheme: {City, Street, ZipCode}
1. Key1: {City, Street}
2. Key2: {ZipCode, Street}
3. No non-key attribute, hence 3NF
4. {City, Street} -> {ZipCode}
5. {ZipCode} -> {City}
6. Dependency between attributes belonging to a key: the determinant ZipCode
is not a candidate key, which violates BCNF
Normalization: Boyce-Codd Normal Form (BCNF)
Example 2 - Movie (Not in BCNF)
Scheme: {MovieTitle, StudioID, MovieID, ActorName, Role, Payment}
1. Key1: {MovieTitle, StudioID, ActorName}
2. Key2: {MovieID, ActorName}
3. Both Role and Payment functionally depend on both candidate keys, thus 3NF
4. {MovieID} -> {MovieTitle}
5. The dependency between MovieID and MovieTitle violates BCNF

Example 3 - Consulting (Not in BCNF)
Scheme: {Client, Problem, Consultant}
(Only one consultant works on a specific client problem)
1. Key1: {Client, Problem}
2. Key2: {Client, Consultant}
3. No non-key attribute, hence 3NF
4. {Client, Problem} -> {Consultant}
5. {Client, Consultant} -> {Problem}
6. Dependency between attributes belonging to keys violates BCNF. (The
determinants in items 4 and 5 are themselves candidate keys; the violation
presupposes a dependency such as {Consultant} -> {Problem}, which is what
the decomposition into {Consultant, Problem} relies on.)
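The BCNF condition (every non-trivial determinant must be a superkey) can be tested mechanically with attribute closures. The sketch below uses illustrative function names; the FD {Consultant} -> {Problem} in the last call is an assumption added here to show what a violating dependency in the Consulting example would look like, since it is not listed on the slide.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Non-trivial FDs whose left-hand side is not a superkey of schema."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not schema <= closure(lhs, fds)]

address = {"City", "Street", "ZipCode"}
address_fds = [({"City", "Street"}, {"ZipCode"}), ({"ZipCode"}, {"City"})]
print(bcnf_violations(address, address_fds))

consulting = {"Client", "Problem", "Consultant"}
key_fds = [({"Client", "Problem"}, {"Consultant"}),
           ({"Client", "Consultant"}, {"Problem"})]
print(bcnf_violations(consulting, key_fds))
# Assumed extra FD: each consultant handles exactly one problem.
print(bcnf_violations(consulting, key_fds + [({"Consultant"}, {"Problem"})]))
```

For Address the check reports {ZipCode} -> {City}. For Consulting, the two key-based FDs alone produce no violation; one appears only once the assumed {Consultant} -> {Problem} is added.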
Normalization: BCNF (Decomposition)
1. Place the two candidate primary keys in separate entities.
2. Place each of the remaining data items in one of the resulting
entities according to its dependency on the primary key.
Example 1 (Convert to BCNF)
Old Scheme: {City, Street, ZipCode}
New Scheme1: {ZipCode, Street}
New Scheme2: {City, Street}
• Loss of the dependency {ZipCode} -> {City}
Alternate New Scheme1: {ZipCode, Street}
Alternate New Scheme2: {ZipCode, City}
Normalization: Decomposition (Loss of Information)
1. If a decomposition does not cause any loss of information, it is
called a lossless decomposition.
2. If a decomposition does not cause any dependencies to be lost, it
is called a dependency-preserving decomposition.
3. Any table scheme can be decomposed in a lossless way into a
collection of smaller schemas that are in BCNF. However,
dependency preservation is not guaranteed.
4. Any table can be decomposed in a lossless way into 3rd normal
form that also preserves the dependencies.
• 3NF may be better than BCNF in some cases

Use your own judgment when decomposing schemas.
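For a decomposition into two schemas there is a standard test for losslessness: the join is lossless if the attributes common to both schemas functionally determine all of one of the two schemas. The sketch below applies that test to the two Address decompositions; the function names are illustrative.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def lossless(r1, r2, fds):
    """Binary test: the common attributes must determine all of r1 or r2."""
    common = closure(r1 & r2, fds)
    return r1 <= common or r2 <= common

fds = [({"City", "Street"}, {"ZipCode"}), ({"ZipCode"}, {"City"})]
print(lossless({"ZipCode", "Street"}, {"City", "Street"}, fds))
print(lossless({"ZipCode", "Street"}, {"ZipCode", "City"}, fds))
```

The split on Street fails the test because Street alone determines nothing, while the split on ZipCode passes because {ZipCode} -> {City}. The second split still gives up the dependency {City, Street} -> {ZipCode}, which illustrates point 3 above.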
Normalization: BCNF (Decomposition)
Example 2 (Convert to BCNF)
Old Scheme: {MovieTitle, StudioID, MovieID, ActorName, Role, Payment}
New Scheme: {MovieID, ActorName, Role, Payment}
New Scheme: {MovieTitle, StudioID, ActorName}
• Loss of the dependency {MovieID} -> {MovieTitle}
New Scheme: {MovieID, ActorName, Role, Payment}
New Scheme: {MovieID, MovieTitle}
• We got the {MovieID} -> {MovieTitle} relationship back
Example 3 (Convert to BCNF)
Old Scheme: {Client, Problem, Consultant}
New Scheme: {Client, Consultant}
New Scheme: {Client, Problem}
• Loss of the relationship {Consultant, Problem}
New Scheme: {Client, Consultant}
New Scheme: {Consultant, Problem}
Normalization: Fourth Normal Form (4NF)
• Fourth normal form eliminates independent many-to-one relationships
between columns.
• To be in Fourth Normal Form,
  - a relation must first be in Boyce-Codd Normal Form.
  - a given relation may not contain more than one multi-valued attribute.
Example (Not in 4NF)
Scheme: {MovieName, ScreeningCity, Genre}
Primary Key: {MovieName, ScreeningCity, Genre}
1. All columns are part of the only candidate key, hence BCNF
2. Many movies can have the same genre
3. Many cities can have the same movie
4. Violates 4NF

Movie             ScreeningCity  Genre
Hard Code         Los Angeles    Comedy
Hard Code         New York       Comedy
Bill Durham       Santa Cruz     Drama
Bill Durham       Durham         Drama
The Code Warrier  New York       Horror
Normalization: Fourth Normal Form (4NF)
Example 2 (Not in 4NF)
Scheme: {Manager, Child, Employee}
1. Primary Key: {Manager, Child, Employee}
2. Each manager can have more than one child
3. Each manager can supervise more than one employee
4. 4NF violated

Manager  Child  Employee
Jim      Beth   Alice
Mary     Bob    Jane
Mary     NULL   Adam

Example 3 (Not in 4NF)
Scheme: {Employee, Skill, ForeignLanguage}
1. Primary Key: {Employee, Skill, ForeignLanguage}
2. Each employee can speak multiple languages
3. Each employee can have multiple skills
4. Thus violates 4NF

Employee  Skill      Language
1234      Cooking    French
1234      Cooking    German
1453      Carpentry  Spanish
1453      Cooking    Spanish
2345      Cooking    Spanish
Normalization: 4NF (Decomposition)
1. Move the two multi-valued relations to separate tables.
2. Identify a primary key for each of the new entities.

Example 1 (Convert to 4NF)
Old Scheme: {MovieName, ScreeningCity, Genre}
New Scheme: {MovieName, ScreeningCity}
New Scheme: {MovieName, Genre}

Movie             Genre
Hard Code         Comedy
Bill Durham       Drama
The Code Warrier  Horror

Movie             ScreeningCity
Hard Code         Los Angeles
Hard Code         New York
Bill Durham       Santa Cruz
Bill Durham       Durham
The Code Warrier  New York
Normalization: 4NF (Decomposition)
Example 2 (Convert to 4NF)
Old Scheme: {Manager, Child, Employee}
New Scheme: {Manager, Child}
New Scheme: {Manager, Employee}

Manager  Child
Jim      Beth
Mary     Bob

Manager  Employee
Jim      Alice
Mary     Jane
Mary     Adam

Example 3 (Convert to 4NF)
Old Scheme: {Employee, Skill, ForeignLanguage}
New Scheme: {Employee, Skill}
New Scheme: {Employee, ForeignLanguage}

Employee  Skill
1234      Cooking
1453      Carpentry
1453      Cooking
2345      Cooking

Employee  Language
1234      French
1234      German
1453      Spanish
2345      Spanish
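The Employee example can be checked in Python: projecting the three-column table onto {Employee, Skill} and {Employee, ForeignLanguage} and joining the projections on Employee gives back exactly the original rows, which is what makes this 4NF decomposition safe. The variable names are illustrative; the data is the table from the slides.

```python
rows = {
    (1234, "Cooking", "French"),
    (1234, "Cooking", "German"),
    (1453, "Carpentry", "Spanish"),
    (1453, "Cooking", "Spanish"),
    (2345, "Cooking", "Spanish"),
}

# Project onto the two new schemes.
skills = {(e, s) for e, s, _ in rows}
languages = {(e, l) for e, _, l in rows}

# Join the projections on Employee.
rejoined = {(e, s, l)
            for e, s in skills
            for e2, l in languages
            if e == e2}
print(len(skills), len(languages), rejoined == rows)
```

Because skills and languages are independent of each other for each employee, the original table must hold every skill and language combination per employee; the two smaller tables record each fact only once.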
Normalization:Fifth Normal Form (5NF)
 Fifth normal form applies to M-Way relationships.
 In 5NF all tables are broken into as many tables as possible in order to
avoid redundancy.
 Once it is in fifth normal form it cannot be broken into smaller
relations without changing the facts or the meaning.

Thank You!
