CIS340 Lecture 15-3

Functional Dependencies and Normalization for Relational Databases

Note: The presentation slides are a means of explaining the lesson and one of the tools for doing so. The primary reference for the course is the textbook adopted in the course description.
Outline

• Normalization Definition
• Normal Form
• Definitions of Keys and Attributes Participating in Keys
• First Normal Form
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• General Definition of 2NF (For Multiple Keys)
• General Definition of 3NF (For Multiple Keys)
• Boyce-Codd Normal Form (BCNF)
Normalization

• The process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations
• A formal technique for analyzing relations based on their primary key and functional dependencies
Normalization

• The technique involves a series of rules that can be used to test an individual relation and certify whether it satisfies a certain normal form
• When a requirement is not met, the relation violating it must be decomposed into more than one relation, each of which individually meets the requirement of that normal form
Chapter 10-4
Normal Form

• Normal form: a condition, using the keys and FDs of a relation, that certifies whether a relation schema is in a particular normal form
• Three normal forms were initially proposed by Codd
   - First Normal Form (1NF)
   - Second Normal Form (2NF)
   - Third Normal Form (3NF)
• Boyce and Codd introduced a stronger definition of third normal form, called Boyce-Codd Normal Form (BCNF)
• Higher normal forms such as 4NF and 5NF deal with situations that are very rare
Normal Form

• Normalization is often executed as a series of steps
• Each step corresponds to a specific normal form
• As normalization proceeds, the relations become progressively stronger in format and less vulnerable to update anomalies
• In general, it is recommended that we
Definitions of Keys and Attributes Participating in Keys

• A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R with the property that no two tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]
• A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more
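The two definitions above can be tried out on a single relation state. The following is a minimal Python sketch, not part of the original slides: the function names and the sample tuples are invented for illustration. Note that a check on one state can only refute a superkey; the definition itself quantifies over every legal state of R.

```python
def is_superkey(rows, attrs):
    """True if no two tuples in this state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """A key is a superkey that stops being one when any attribute is removed."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != removed])
        for removed in attrs
    )


# Hypothetical sample state (values invented for illustration).
works_on = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 10},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 20},
    {"SSN": "222", "PNUMBER": 1, "HOURS": 10},
]

print(is_superkey(works_on, ["SSN", "PNUMBER", "HOURS"]))  # True
print(is_key(works_on, ["SSN", "PNUMBER", "HOURS"]))       # False: HOURS is removable
print(is_key(works_on, ["SSN", "PNUMBER"]))                # True
```

In this sample, {SSN, PNUMBER, HOURS} is a superkey but not a key, because dropping HOURS still leaves every tuple distinguishable.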
Definitions of Keys and Attributes Participating in Keys

• If a relation schema has more than one key, each is called a candidate key.
• One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys.
• Prime attribute
   - an attribute that is a member of some candidate key
• Nonprime attribute
   - an attribute that is not prime, that is, it is not a member of any candidate key
First Normal Form

• Disallows
   - composite attributes
   - multivalued attributes
   - nested relations
   - attributes whose values for an individual tuple are non-atomic
• It states that the domain of an attribute must include only atomic (single, indivisible) values
• In other words, it disallows relations within relations
1NF Example (1)

DEPARTMENT
DNAME  DNUMBER  DMGRSSN  DLOCATIONS

• We assume that each department can have a number of locations
• This relation is not in 1NF, because DLOCATIONS is not an atomic attribute
• There are two ways we can look at the DLOCATIONS attribute
1NF Example (1)

• The domain of DLOCATIONS contains atomic values

DNAME       DNUMBER  DMGRSSN    DLOCATIONS
Researcher  5        333445555  Houston
Researcher  5        333445555  Stanford

• In this case, DLOCATIONS is not functionally dependent on the primary key, and the repeated values cause update anomalies because they introduce redundancy into the relation
1NF Example (1)

• The domain of DLOCATIONS contains sets of values and hence is non-atomic

DNAME       DNUMBER  DMGRSSN    DLOCATIONS
Researcher  5        333445555  {Houston, Stanford}

• In this case, DLOCATIONS is functionally dependent on the primary key, because each set is considered a single member of the attribute domain
1NF Example (1)

• In either case, the DEPARTMENT relation is not in 1NF
• There are three main techniques to achieve first normal form for such a relation

1- Remove the attribute DLOCATIONS that violates 1NF and place it in a separate relation DEPT_LOCATION along with the primary key DNUMBER

DEPARTMENT
DNAME  DNUMBER  MGRSSN

DEPT_LOCATION
DNUMBER  DLOCATION

• This decomposes the non-1NF relation into two 1NF relations
1NF Example (1)

2- Expand the key to be {DNUMBER, DLOCATION} in the same relation, with one tuple per location

• This solution has the disadvantage of introducing redundancy
1NF Example (1)

3- If a maximum number of values is known for the attribute DLOCATIONS, say at most three locations per department, then replace the attribute DLOCATIONS with three atomic attributes DLOCATION1, DLOCATION2, and DLOCATION3

DNAME  DNUMBER  MGRSSN  DLOCATION1  DLOCATION2  DLOCATION3
1NF Example (1)

• This solution has disadvantages
   - It introduces null values if most departments have fewer than three locations
   - Querying on this attribute becomes more difficult. For example, consider how you would write the query "List the departments that have 'Bellair' as one of their locations"
   - SELECT Dname FROM Department WHERE Dlocation1 = 'Bellair' OR Dlocation2 = 'Bellair' OR Dlocation3 = 'Bellair'
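For comparison, the same question can be asked against both designs. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name DEPARTMENT_WIDE and the sample rows (including the 'Bellair' location) are assumptions made for illustration only; the slides do not show this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Third technique: one column per possible location (NULL when unused).
    CREATE TABLE DEPARTMENT_WIDE (
        DNAME TEXT, DNUMBER INTEGER PRIMARY KEY, MGRSSN TEXT,
        DLOCATION1 TEXT, DLOCATION2 TEXT, DLOCATION3 TEXT
    );
    INSERT INTO DEPARTMENT_WIDE
    VALUES ('Researcher', 5, '333445555', 'Houston', 'Bellair', NULL);

    -- First technique: locations moved to a separate relation.
    CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER PRIMARY KEY, MGRSSN TEXT);
    CREATE TABLE DEPT_LOCATION (
        DNUMBER INTEGER REFERENCES DEPARTMENT(DNUMBER),
        DLOCATION TEXT,
        PRIMARY KEY (DNUMBER, DLOCATION)
    );
    INSERT INTO DEPARTMENT VALUES ('Researcher', 5, '333445555');
    INSERT INTO DEPT_LOCATION VALUES (5, 'Houston'), (5, 'Bellair');
""")

# Wide design: every location column has to be tested separately.
wide = conn.execute(
    "SELECT DNAME FROM DEPARTMENT_WIDE "
    "WHERE DLOCATION1 = ? OR DLOCATION2 = ? OR DLOCATION3 = ?",
    ("Bellair",) * 3,
).fetchall()

# Separate relation: a single comparison, independent of the number of locations.
narrow = conn.execute(
    "SELECT D.DNAME FROM DEPARTMENT AS D "
    "JOIN DEPT_LOCATION AS L ON L.DNUMBER = D.DNUMBER "
    "WHERE L.DLOCATION = ?",
    ("Bellair",),
).fetchall()
conn.close()

print(wide, narrow)
```

Both queries return the same department here, but only the wide one has to change if a fourth location column is ever added.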
1NF Example (1)

• The first solution is generally considered the best one because
   - it does not suffer from redundancy
   - it is completely general
   - it places no limit on the maximum number of values
Normalizing nested relations into 1NF

• 1NF also disallows multivalued attributes that are themselves composite.
• These are called nested relations because each tuple can have a relation within it
Normalizing nested relations into 1NF

• The schema of the above relation can be represented as
   - EMP_PROJ(SSN, ENAME, {PROJS(PNUMBER, HOURS)})
• The set braces {} identify the attribute PROJS as multivalued, and the component attributes that form PROJS are listed between parentheses ()
• Notice that SSN is the primary key of EMP_PROJ, while PNUMBER is the partial key of the nested relation
Normalizing nested relations into 1NF

• To normalize this relation into 1NF, we move the nested relation attributes into a new relation and propagate the primary key into it
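The unnesting step can be sketched in Python as follows. This is an illustrative sketch only: the function name `unnest`, the result names `emp_proj1` and `emp_proj2`, and the sample tuples are assumptions, not taken from the slides.

```python
def unnest(rows, key, nested):
    """Split rows holding a nested relation into a flat parent and a flat child."""
    # Parent relation: everything except the nested attribute.
    parent = [{k: v for k, v in row.items() if k != nested} for row in rows]
    # Child relation: one tuple per nested tuple, with the parent's key propagated.
    child = [
        {key: row[key], **inner}
        for row in rows
        for inner in row[nested]
    ]
    return parent, child


# Hypothetical state of EMP_PROJ(SSN, ENAME, {PROJS(PNUMBER, HOURS)}).
emp_proj = [
    {"SSN": "123", "ENAME": "Smith",
     "PROJS": [{"PNUMBER": 1, "HOURS": 32.5}, {"PNUMBER": 2, "HOURS": 7.5}]},
    {"SSN": "456", "ENAME": "Wong",
     "PROJS": [{"PNUMBER": 2, "HOURS": 10.0}]},
]

emp_proj1, emp_proj2 = unnest(emp_proj, key="SSN", nested="PROJS")
print(emp_proj1)
print(emp_proj2)
```

After the split, the child relation is identified by the combination of the propagated key SSN and the partial key PNUMBER.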
Normalizing nested relations into 1NF

• Example
   - PERSON(SS#, {CAR_LIC}, {PHONE#})
   - This relation represents the fact that a person has multiple cars and multiple phones
• The right way to deal with this relation is to decompose it into separate relations
   - P_Cars(SS#, CAR_LIC)
   - P_Phone(SS#, PHONE#)
Second Normal Form 2NF

• 2NF uses the concepts of FDs and primary key
• Full functional dependency
   - an FD Y → Z where removal of any attribute from Y means the FD does not hold any more
• Examples:
   - {SSN, PNUMBER} → HOURS is a full FD, since neither SSN → HOURS nor PNUMBER → HOURS holds
   - {SSN, PNUMBER} → ENAME is not a full FD (it is called a partial dependency), since SSN → ENAME also holds
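The two examples can be checked against a sample relation state. The sketch below is illustrative only: the function names and the tuples are invented, and testing one state can show that an FD fails but cannot prove that it holds in every legal state.

```python
def fd_holds(rows, lhs, rhs):
    """Check lhs -> rhs on one state: equal lhs values must give equal rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_full_fd(rows, lhs, rhs):
    """Full FD: it holds, but fails as soon as any one attribute is dropped from lhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    return not any(
        fd_holds(rows, [a for a in lhs if a != dropped], rhs) for dropped in lhs
    )


# Hypothetical sample tuples (values invented for illustration).
emp_proj = [
    {"SSN": "123", "PNUMBER": 1, "HOURS": 32.5, "ENAME": "Smith"},
    {"SSN": "123", "PNUMBER": 2, "HOURS": 7.5, "ENAME": "Smith"},
    {"SSN": "456", "PNUMBER": 1, "HOURS": 20.0, "ENAME": "Wong"},
]

print(is_full_fd(emp_proj, ["SSN", "PNUMBER"], ["HOURS"]))  # True
print(is_full_fd(emp_proj, ["SSN", "PNUMBER"], ["ENAME"]))  # False: SSN alone suffices
```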
Second Normal Form 2NF

• A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key
• R can be decomposed into 2NF relations via the process of 2NF normalization
Second Normal Form 2NF

• The test for 2NF involves testing for FDs whose left-hand side attributes are part of the primary key.
• If the primary key contains a single attribute, the test need not be applied at all
Normalizing into 2NF

• Consider the following relation
• The non-prime attribute ENAME violates 2NF because of FD2: SSN → ENAME
• The non-prime attributes PNAME and PLOCATION violate 2NF because of FD3: PNUMBER → PNAME and PNUMBER → PLOCATION
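The 2NF test can be sketched with an attribute-closure computation. The relation figure is not reproduced in this text, so the sketch assumes the primary key is {SSN, PNUMBER} and that the FDs are {SSN, PNUMBER} → HOURS together with FD2 and FD3 as stated above. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(primary_key, fds):
    """Map each non-key attribute to a proper part of the key that determines it."""
    key = set(primary_key)
    found = {}
    for size in range(1, len(key)):              # proper, non-empty subsets only
        for part in combinations(sorted(key), size):
            for attr in sorted(closure(part, fds) - key):
                found.setdefault(attr, set(part))
    return found


# Assumed FDs for the relation discussed above.
fds = [
    (("SSN", "PNUMBER"), ("HOURS",)),
    (("SSN",), ("ENAME",)),
    (("PNUMBER",), ("PNAME", "PLOCATION")),
]

violations = partial_dependencies(("SSN", "PNUMBER"), fds)
print(violations)
```

Each entry in the result names a non-key attribute that would move, together with the key part that determines it, into its own relation during 2NF normalization. HOURS does not appear because it depends on the whole key.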
Third Normal Form 3NF

• Transitive functional dependency
   - an FD X → Z that can be derived from two FDs X → Y and Y → Z
• Examples
   - SSN → DMGRSSN is a transitive FD, since SSN → DNUMBER and DNUMBER → DMGRSSN hold
   - SSN → ENAME is non-transitive, since there is no set of attributes X where SSN → X and X → ENAME
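A small Python sketch of this definition follows. It only looks for a single intermediate step among the FDs that are explicitly listed, which is a simplification of the full inference rules, and the function name `intermediate_sets` is hypothetical.

```python
def intermediate_sets(x, z, fds):
    """List every Y (different from X) with X -> Y and Y -> Z both listed in fds."""
    x, z = frozenset(x), frozenset(z)
    found = []
    for first_lhs, first_rhs in fds:
        if frozenset(first_lhs) != x:
            continue
        for second_lhs, second_rhs in fds:
            y = frozenset(second_lhs)
            if y != x and y <= frozenset(first_rhs) and z <= frozenset(second_rhs):
                found.append(set(y))
    return found


# FDs taken from the examples above.
fds = [
    (("SSN",), ("DNUMBER",)),
    (("SSN",), ("ENAME",)),
    (("DNUMBER",), ("DMGRSSN",)),
]

print(intermediate_sets(["SSN"], ["DMGRSSN"], fds))  # [{'DNUMBER'}]
print(intermediate_sets(["SSN"], ["ENAME"], fds))    # []
```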
Third Normal Form 3NF

• A relation schema R is in third normal form (3NF) if
   - it is in 2NF, and
   - no non-prime attribute A in R is transitively dependent on the primary key
• R can be decomposed into 3NF relations via the process of 3NF normalization
Third Normal Form 3NF

• The relation EMP_DEPT is in 2NF since no partial dependencies on a key exist
• It is not in 3NF because the non-prime attributes DNAME and DMGRSSN are transitively dependent on the primary key SSN via DNUMBER
General Definition of 2NF (For Multiple Keys)

• The above definitions consider the primary key only
• The following more general definitions take into account relations with multiple candidate keys
• A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on every key of R
• The test for 2NF involves testing for FDs whose left-hand side attributes are part of a candidate key.
• If every candidate key contains a single attribute, the test need not be applied at all
2NF Example

• Consider the relation schema LOTS, which describes parcels of land for sale in various counties
• Suppose that there are two candidate keys, PROPERTY_ID# and {COUNTY_NAME, LOT#}, and that we choose PROPERTY_ID# as the primary key, so FD1 and FD2 hold
2NF Example

• Suppose that the following two additional FDs hold in LOTS
   FD3: COUNTY_NAME → TAX_RATE
   FD4: AREA → PRICE
• FD3 violates 2NF because TAX_RATE is partially dependent on the candidate key {COUNTY_NAME, LOT#}
• FD4 does not violate 2NF
2NF Example

• To normalize LOTS into 2NF, we decompose it into two relations, LOTS1 and LOTS2
   - by removing the attribute TAX_RATE that violates 2NF and placing it, together with COUNTY_NAME, into another relation LOTS2
General Definition of 3NF (For Multiple Keys)

• Superkey of relation schema R
   - a set of attributes S of R that contains a key of R
• A relation schema R is in third normal form (3NF) if, whenever an FD X → A holds in R, then either:
   (a) X is a superkey of R, or
   (b) A is a prime attribute of R
3NF Example

• According to this definition, LOTS2 is in 3NF
• However, FD4 in LOTS1 violates 3NF because AREA is not a superkey and PRICE is not a prime attribute in LOTS1
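The general 3NF test, conditions (a) and (b), can be sketched in Python as below. The attribute lists of LOTS1 and LOTS2 are assumptions, since the schema figures are not reproduced here: LOTS1 is taken to be LOTS without TAX_RATE, and LOTS2 to be (COUNTY_NAME, TAX_RATE). FD1 and FD2 are assumed to say that each candidate key determines the remaining attributes. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations_3nf(attributes, fds, candidate_keys):
    """Return the listed FDs X -> A that meet neither condition (a) nor (b)."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        lhs_is_superkey = closure(lhs, fds) >= set(attributes)   # condition (a)
        for a in rhs:
            if a in lhs:                                         # trivial FD
                continue
            if not lhs_is_superkey and a not in prime:           # condition (b) fails
                bad.append((tuple(lhs), a))
    return bad


lots1 = ["PROPERTY_ID#", "COUNTY_NAME", "LOT#", "AREA", "PRICE"]
lots1_keys = [("PROPERTY_ID#",), ("COUNTY_NAME", "LOT#")]
lots1_fds = [
    (("PROPERTY_ID#",), ("COUNTY_NAME", "LOT#", "AREA", "PRICE")),  # FD1 (assumed form)
    (("COUNTY_NAME", "LOT#"), ("PROPERTY_ID#", "AREA", "PRICE")),   # FD2 (assumed form)
    (("AREA",), ("PRICE",)),                                        # FD4
]
print(violations_3nf(lots1, lots1_fds, lots1_keys))   # [(('AREA',), 'PRICE')]

lots2 = ["COUNTY_NAME", "TAX_RATE"]
lots2_fds = [(("COUNTY_NAME",), ("TAX_RATE",))]                     # FD3
print(violations_3nf(lots2, lots2_fds, [("COUNTY_NAME",)]))         # []
```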
3NF Example

• To normalize LOTS1 into 3NF, we decompose it into the relation schemas LOTS1A and LOTS1B
• We construct LOTS1A by removing the attribute PRICE that violates 3NF and placing it, together with AREA, into LOTS1B
General Definitions of Second and Third Normal Forms
BCNF Example

• Suppose that we have thousands of lots in the relation, but the lots are from only two counties: Quebec and Toronto
• Suppose also that the lot sizes in Quebec are only 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0 acres
• Whereas the lot sizes in Toronto are restricted to 1.1, 1.2, ..., 1.9, and 2.0 acres
Dr. Mohammad El-Helly
BCNF Example

• In such a situation we would have the additional FD
   FD5: AREA → COUNTY_NAME
• LOTS1A is still in 3NF because COUNTY_NAME is a prime attribute, but it is not in BCNF because AREA is not a superkey
BCNF Example

• To normalize it into BCNF, we decompose LOTS1A into two relations, LOTS1AX and LOTS1AY
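One common way to carry out such a decomposition is to split the relation on the violating FD X → A into (R − A) and (X ∪ A). The Python sketch below applies that step to FD5. The attribute list of LOTS1A (LOTS1 without PRICE) and the assignment of the two results to the names LOTS1AX and LOTS1AY are assumptions, since the slide figure is not reproduced here.

```python
def bcnf_split(attributes, lhs, rhs):
    """Split R on a violating FD X -> A into (R - A) and (X ∪ A)."""
    lhs = set(lhs)
    rhs = set(rhs) - lhs
    remainder = [a for a in attributes if a not in rhs]        # R - A
    extracted = [a for a in attributes if a in lhs | rhs]      # X ∪ A
    return remainder, extracted


# Assumed schema of LOTS1A.
lots1a = ["PROPERTY_ID#", "COUNTY_NAME", "LOT#", "AREA"]

# FD5: AREA -> COUNTY_NAME violates BCNF because AREA is not a superkey.
lots1ax, lots1ay = bcnf_split(lots1a, lhs=["AREA"], rhs=["COUNTY_NAME"])
print(lots1ax)   # ['PROPERTY_ID#', 'LOT#', 'AREA']
print(lots1ay)   # ['COUNTY_NAME', 'AREA']
```

After this split, COUNTY_NAME and LOT# no longer appear in the same relation, so a dependency whose left-hand side is {COUNTY_NAME, LOT#} can no longer be checked within a single relation.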


Example

• Un-Normalized Form
   R(ID, Name, Type, Age, Owner, {Visit(VDate, PNO, PName)})

R:  ID  Name  Type  Age  Owner
V:  VDate  PNO  PName
Example

1NF:  R1(ID, Name, Type, Age, Owner)
      R2(ID, VisitDate, PNO, PName)

2NF:  R2A(ID, VisitDate, PNO)
      R2B(PNO, PName)

3NF:  R2A(ID, VisitDate, PNO)
      R2B(PNO, PName)