Chapter 4-Functional Dependency and Normalization
06/22/2024 30
Examples of FD constraints (1)
social security number determines employee name
SSN -> ENAME
project number determines project name and location
PNUMBER -> {PNAME, PLOCATION}
employee SSN and project number determine the hours per
week that the employee works on the project
{SSN, PNUMBER} -> HOURS
Graphical representation of Functional Dependencies
Examples of FD constraints (2)
An FD is a property of the attributes in the schema R, not of a
particular legal relation state r of R.
It must be defined explicitly by someone who knows the
semantics of the attributes of R.
The constraint must hold on every relation instance r(R)
If K is a key of R, then K functionally determines all
attributes in R.
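The last point can be checked mechanically with attribute closure. The following is a minimal Python sketch, not taken from the slides: the helper names `closure` and `is_superkey` are illustrative, and the schema R is assumed to consist of exactly the attributes named in the FD examples above.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once its whole left-hand side is available.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs determines all of R exactly when its closure equals the schema."""
    return closure(attrs, fds) == set(schema)


# Assumed schema: the attributes appearing in the FD examples.
R = {"SSN", "ENAME", "PNUMBER", "PNAME", "PLOCATION", "HOURS"}
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(is_superkey({"SSN", "PNUMBER"}, R, F))  # True
print(is_superkey({"SSN"}, R, F))             # False
```

Under these assumptions {SSN, PNUMBER} determines every attribute of R, while SSN alone determines only ENAME.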
Redundant functional dependencies
Given a set F of FDs, an FD A -> B of F is said to be redundant
with respect to the FDs of F iff A -> B can be derived from
the set of FDs F - {A -> B}
Redundant FDs are extra and unnecessary and can be safely
removed from the set F.
Eliminating redundant FDs allows us to minimize the set of
FDs.
Membership Algorithm helps us to determine redundant FDs.
Membership Algorithm
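Assume F is a set of functional dependencies with A -> B ∈ F.
To determine whether A -> B is redundant with respect to the other
FDs of the set F, compute the closure of A using only F - {A -> B}.
A -> B is redundant iff B is contained in that closure.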
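The membership test can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names and the FD set over attributes A, B, C are hypothetical and do not come from the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_redundant(fd, fds):
    """Membership test: fd is redundant if the other FDs still imply it."""
    lhs, rhs = fd
    others = [g for g in fds if g != fd]
    return rhs <= closure(lhs, others)


# Hypothetical FD set: A -> B, B -> C, A -> C.
F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"})]

print(is_redundant(({"A"}, {"C"}), F))  # True: follows from A -> B and B -> C
print(is_redundant(({"A"}, {"B"}), F))  # False
```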
SSN SSN
DNUMBER DNAME
To determine a systematic way to infer dependencies, a set of
inference rules has to be discovered that can be used to infer new
dependencies from a given set of dependencies.
Inference Rules for Functional Dependencies
Initialize G to F.
That is set G=F.
Test every FD of G for redundancy using the Membership
Algorithm until there are no more FDs of G to be tested.
The set G is a non redundant cover of F.
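The procedure above can be sketched in Python. This is a sketch only: `nonredundant_cover` is an illustrative name and the FD set is hypothetical. The result depends on the order in which FDs are tested, so a set F may have more than one nonredundant cover.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def nonredundant_cover(fds):
    g = list(fds)  # Initialize G to F.
    for fd in list(g):
        lhs, rhs = fd
        rest = [x for x in g if x != fd]
        # Membership test: drop fd if the remaining FDs of G imply it.
        if rhs <= closure(lhs, rest):
            g = rest
    return g


# Hypothetical FD set: A -> B, B -> A, B -> C, A -> C.
F = [({"A"}, {"B"}), ({"B"}, {"A"}), ({"B"}, {"C"}), ({"A"}, {"C"})]
G = nonredundant_cover(F)
print(G)  # B -> C is dropped because B -> A and A -> C imply it
```

Testing the FDs in a different order would drop A -> C instead and keep B -> C.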
Extraneous Attributes
Further reduction of the size of the FDs of F by removing either
extraneous left attributes with respect to F or extraneous right
attributes with respect to F.
Let F be a set of FDs over schema R, and let A1A2 -> B1B2 be an FD in F.
A1 is extraneous iff
F ≡ (F - {A1A2 -> B1B2}) ∪ {A2 -> B1B2}
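In practice the equivalence is usually tested through closure: a left-hand attribute is extraneous when the remaining left-hand attributes already determine the right-hand side under F. A minimal Python sketch, with illustrative function names and a hypothetical FD set:

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_extraneous_left(attr, fd, fds):
    """attr is extraneous in fd if (lhs - attr) still determines rhs under fds."""
    lhs, rhs = fd
    return rhs <= closure(lhs - {attr}, fds)


# Hypothetical FD set: A -> B, AB -> C.
F = [({"A"}, {"B"}), ({"A", "B"}, {"C"})]
fd = ({"A", "B"}, {"C"})

print(is_extraneous_left("B", fd, F))  # True: A alone gives B, then C
print(is_extraneous_left("A", fd, F))  # False: B alone determines nothing else
```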
Canonical Cover
For a given set F of FDs, a canonical cover, denoted by
Fc, is a set of FDs where the following conditions are
simultaneously satisfied:
I. Every FD of Fc is simple. That is RHS of every FDs of
Fc has only one attribute
II. Fc is left-reduced.
III. Fc is nonredundant.
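One common way to obtain a set satisfying the three conditions is to apply them in order: split right-hand sides, left-reduce, then remove redundant FDs. The Python sketch below follows that order. The function names are illustrative, and the starting set E = {B -> A, D -> A, AB -> D} is an assumption chosen so that the result can be compared with the minimum cover {B -> D, D -> A} shown in the Minimal Sets of FDs example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    # I. Make every FD simple: one attribute on the right-hand side.
    g = [(frozenset(lhs), frozenset({a})) for lhs, rhs in fds for a in sorted(rhs)]

    # II. Left-reduce: drop each extraneous left-hand attribute.
    left_reduced = []
    for lhs, rhs in g:
        for b in sorted(lhs):
            if rhs <= closure(lhs - {b}, g):
                lhs = lhs - {b}
        if (lhs, rhs) not in left_reduced:
            left_reduced.append((lhs, rhs))

    # III. Remove FDs that the remaining FDs already imply.
    cover = list(left_reduced)
    for fd in left_reduced:
        rest = [x for x in cover if x != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest
    return cover


# Assumed starting set E: B -> A, D -> A, AB -> D.
E = [({"B"}, {"A"}), ({"D"}, {"A"}), ({"A", "B"}, {"D"})]
Fc = canonical_cover(E)
for lhs, rhs in Fc:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

With this input, AB -> D is left-reduced to B -> D, after which B -> A becomes redundant, leaving D -> A and B -> D.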
Minimal Sets of FDs
A set of FDs is minimal if it satisfies the following conditions:
Given B -> A
Hence B -> D
Now E' = {B -> A, D -> A, B -> D}
Minimum cover of E = {B -> D, D -> A}
Minimal Sets of FDs (2)
Every set of FDs has an equivalent minimal set
There can be several equivalent minimal sets
There is no simple algorithm for computing a minimal set of
FDs that is equivalent to a set F of FDs
To synthesize a set of relations, we assume that we start with a
set of dependencies that is a minimal set
Normalization
What is normalization? It is a database design technique that reduces data
redundancy and eliminates undesirable characteristics such as insertion, update,
and deletion anomalies.
Normalization is a process for evaluating and correcting table structures to
minimize data redundancies, thereby helping to eliminate data anomalies.
Formal technique for analyzing a relation based on its primary key and the
functional dependencies between the attributes of that relation.
Often executed as a series of steps. Each step corresponds to a specific
normal form, which has known properties.
As normalization proceeds, the relations become progressively more
restricted (stronger) in format and also less vulnerable to update
anomalies.
It helps us evaluate table structures and produce good tables.
Normalization
Why do we need to normalize?
To avoid redundancy (less storage space needed, and data is consistent)
To avoid update/delete anomalies(Data consistency within the database)
The four most commonly used normal forms are first (1NF), second (2NF), third
(3NF) and Boyce-Codd (BCNF) normal forms.
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table, transform the data from the information
source into table format with columns and rows.
Normalization
This is the process which allows you to winnow out redundant
data within your database.
The results of a well executed normalization process are the
same as those of a well-planned E-R model
This involves restructuring the tables to successively meet
higher normal forms.
A properly normalized database should have the following
characteristics
Scalar values in each field
Absence of redundancy.
Minimal use of null values.
Minimal loss of information.
Normalization: Process
Isolate Independent multiple relationships
No table may contain two or more 1:n or n:m relationships
that are not directly related.
Isolate Semantically Related Multiple Relationships
There may be practical constraints on information that
justify separating logically related many-to-many
relationships.
Normalization: Levels
Levels of normalization based on the amount of redundancy in
the database.
Relational theory defines a number of structure conditions called
Normal Forms that assure that certain data anomalies do not
occur in a database.
Various levels of normalization are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
(Diagram labels: Redundancy, Number of Tables, Complexity)
Most databases should be in 3NF or BCNF in order to avoid database anomalies.
Normalization: Levels
Level   Requirement
1NF     Keys; no repeating groups or multivalued attributes
2NF     No partial dependencies
3NF     No transitive dependencies
BCNF    Determinants are candidate keys
4NF     No multivalued dependencies
5NF     No join dependencies other than those implied by the candidate keys
Each higher level is a subset of the lower level
Normalization: First Normal Form (1NF)
A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to list of values).
Example (Not 1NF)
Author and AuPhone columns are not scalar
Normalization
1NF: Decomposition
1. Place all items appearing in the repeating group in a new table
2. Designate a primary key for each new table produced.
3. Create a relationship between the two tables
• For 1:N relation duplicate the P.K. from 1 side to many side
• For M:N relation create a new table with P.K. from both tables
Example (1NF) ISBN AuName AuPhone
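The decomposition steps can be sketched in Python. This is an illustration only: the pairing of authors with books below is hypothetical, and `to_1nf` is an assumed helper name. The repeating author group is moved to a new table whose rows carry the book's ISBN.

```python
# Hypothetical unnormalized input: each book carries a repeating group of authors.
books = [
    {"ISBN": "0-321-32132-1", "Title": "Balloon",
     "Authors": [("Sleepy", "321-321-1111"), ("Snoopy", "232-234-1234")]},
    {"ISBN": "0-123-45678-0", "Title": "Ulysses",
     "Authors": [("Joyce", "666-666-6666")]},
]


def to_1nf(records):
    """Split records into a book table and an {ISBN, AuName, AuPhone} table."""
    book_rows, author_rows = [], []
    for rec in records:
        book_rows.append({"ISBN": rec["ISBN"], "Title": rec["Title"]})
        for name, phone in rec["Authors"]:
            # One row per (book, author) pair, so every field holds a scalar.
            author_rows.append({"ISBN": rec["ISBN"], "AuName": name, "AuPhone": phone})
    return book_rows, author_rows


book_rows, author_rows = to_1nf(books)
for row in author_rows:
    print(row["ISBN"], row["AuName"], row["AuPhone"])
```

In this sketch the new table could use {ISBN, AuName} as its primary key, assuming author names are unique per book.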
Normalization: Functional Dependencies
1. If one set of attributes in a table determines another set of
attributes in the table, then the second set of attributes is
said to be functionally dependent on the first set of
attributes.
Example 1
Table Scheme: {ISBN, Title, Price}
Functional Dependencies: {ISBN} -> {Title}
                         {ISBN} -> {Price}
ISBN           Title         Price
0-321-32132-1  Balloon       $34.00
0-55-123456-9  Main Street   $22.95
0-123-45678-0  Ulysses       $34.00
1-22-233700-0  Visual Basic  $25.00
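A relation instance can refute a functional dependency, although it can never prove one, since an FD is a statement about all legal states. The Python sketch below tests the Example 1 rows; `fd_holds` is an illustrative helper name, not something from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the Example 1 table.
books = [
    {"ISBN": "0-321-32132-1", "Title": "Balloon", "Price": "$34.00"},
    {"ISBN": "0-55-123456-9", "Title": "Main Street", "Price": "$22.95"},
    {"ISBN": "0-123-45678-0", "Title": "Ulysses", "Price": "$34.00"},
    {"ISBN": "1-22-233700-0", "Title": "Visual Basic", "Price": "$25.00"},
]

print(fd_holds(books, ["ISBN"], ["Title"]))   # True
print(fd_holds(books, ["ISBN"], ["Price"]))   # True
print(fd_holds(books, ["Price"], ["Title"]))  # False: two titles cost $34.00
```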
Normalization: Functional Dependencies
Example 2
Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies: {PubID} -> {PubPhone}
                         {PubID} -> {PubName}
                         {PubName, PubPhone} -> {PubID}
PubID  PubName      PubPhone
1      Big House    999-999-9999
2      Small House  123-456-7890
3      Alpha Press  111-111-1111
Example 3
Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies: {AuID} -> {AuPhone}
                         {AuID} -> {AuName}
                         {AuName, AuPhone} -> {AuID}
AuID  AuName  AuPhone
1     Sleepy  321-321-1111
2     Snoopy  232-234-1234
3     Grumpy  665-235-6532
4     Jones   123-333-3333
5     Smith   654-223-3455
6     Joyce   666-666-6666
7     Roman   444-444-4444
Normalization: Dependency Diagram
The primary key components are bold, underlined, and shaded in a
different color.
The arrows above entities indicate all desirable dependencies, i.e.,
dependencies that are based on PK.
The arrows below the dependency diagram indicate less desirable
dependencies -- partial dependencies and transitive dependencies
Example:
Normalization: Functional Dependencies(Example)
I. Database to track reviews of papers submitted to an academic
conference. Prospective authors submit papers for review and
possible acceptance in the published conference proceedings.
Details of the entities:
Author information includes a unique author number, a name, a mailing
address, and a unique (optional) email address.
Paper information includes the primary author, the paper number, the title,
the abstract, and review status (pending, accepted, rejected)
Reviewer information includes the reviewer number, the name, the mailing
address, and a unique (optional) email address
A completed review includes the reviewer number, the date, the paper
number, comments to the authors, comments to the program chairperson,
and ratings (overall, originality, correctness, style, clarity)
Normalization: Functional Dependencies(Example)
Functional Dependencies
AuthNo -> AuthName, AuthEmail, AuthAddress
AuthEmail -> AuthNo
PaperNo -> Primary-AuthNo, Title, Abstract, Status
RevNo -> RevName, RevEmail, RevAddress
RevEmail -> RevNo
RevNo, PaperNo -> AuthComm, Prog-Comm, Date,
Rating1, Rating2, Rating3, Rating4, Rating5
Normalization: Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements:
The table is in first normal form
All nonkey attributes in the table must be functionally dependent on the
entire primary key
Note: Remember that we are dealing with non-key attributes
Normalization: Second Normal Form (2NF)
Example 2 (Not 2NF)
Scheme {City, Street, HouseNumber, HouseColor, CityPopulation}
1. key {City, Street, HouseNumber}
2. {City, Street, HouseNumber} -> {HouseColor}
3. {City} -> {CityPopulation}
4. CityPopulation does not belong to any key.
5. CityPopulation is functionally dependent on City, which is a proper subset
of the key
Normalization: 2NF( Decomposition)
1. If a data item is fully functionally dependent on only a part of the
primary key, move that data item and that part of the primary key
to a new table.
2. If other data items are functionally dependent on the same part of
the key, place them in the new table also
3. Make the partial primary key copied from the original table the
primary key for the new table.
(Place all items that appear in the repeating group in a new table)
Example 1 (Convert to 2NF)
Old Scheme {StudentId, CourseId, StudentName, CourseTitle, Grade}
New Scheme {StudentId, StudentName}
New Scheme {CourseId, CourseTitle}
New Scheme {StudentId, CourseId, Grade}
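The 2NF decomposition steps can be sketched in Python. Assumptions: the relation has a single candidate key {StudentId, CourseId}, and the FDs listed in the code are inferred from the decomposition shown in Example 1 rather than stated on the slide. The function name `decompose_2nf` is illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def decompose_2nf(schema, key, fds):
    nonkey = set(schema) - set(key)
    moved, schemes = set(), []
    # Examine every proper, non-empty subset of the key.
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            dependents = (closure(part, fds) & nonkey) - moved
            if dependents:
                # Move the dependents and that part of the key to a new table.
                schemes.append(set(part) | dependents)
                moved |= dependents
    schemes.append(set(schema) - moved)  # what remains depends on the full key
    return schemes


schema = {"StudentId", "CourseId", "StudentName", "CourseTitle", "Grade"}
key = {"StudentId", "CourseId"}
# Assumed FDs, inferred from the decomposition in Example 1.
fds = [
    ({"StudentId"}, {"StudentName"}),
    ({"CourseId"}, {"CourseTitle"}),
    ({"StudentId", "CourseId"}, {"Grade"}),
]

result = decompose_2nf(schema, key, fds)
for scheme in result:
    print(sorted(scheme))
```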
Normalization: 2NF( Decomposition)
Example 2 (Convert to 2NF)
Old Scheme {StudioID, Movie, Budget, StudioCity}
New Scheme {Movie, StudioID, Budget}
New Scheme {StudioID, StudioCity}
Normalization: Third Normal Form (3NF)
This form dictates that all non-key attributes of a table
must be functionally dependent on a candidate key such
that there are no interdependencies among non-key
attributes i.e., there should be no transitive dependencies
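A 3NF check can be sketched in Python using the usual formulation: every nontrivial FD X -> A must have a superkey X or a prime attribute A. The sketch uses the scheme {StudioID, StudioCity, CityTemp} from the 3NF decomposition examples in this chapter. The FDs and the single candidate key {StudioID} are assumptions inferred from that example, and the function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(schema, fds, candidate_keys):
    """List (lhs, attribute) pairs that break 3NF."""
    prime = set().union(*candidate_keys)  # attributes belonging to some key
    bad = []
    for lhs, rhs in fds:
        lhs_is_superkey = closure(lhs, fds) == set(schema)
        for attr in sorted(rhs - lhs):  # skip the trivial part of the FD
            if not lhs_is_superkey and attr not in prime:
                bad.append((lhs, attr))
    return bad


schema = {"StudioID", "StudioCity", "CityTemp"}
# Assumed FDs: StudioID -> StudioCity, StudioCity -> CityTemp.
fds = [({"StudioID"}, {"StudioCity"}), ({"StudioCity"}, {"CityTemp"})]

violations = third_nf_violations(schema, fds, [{"StudioID"}])
print(violations)  # StudioCity -> CityTemp is the transitive dependency
```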
Normalization:3NF(Decomposition)
1. Move all items involved in transitive dependencies to a new entity.
2. Identify a primary key for the new entity.
3. Place the primary key for the new entity as a foreign key on the
original entity.
Example 1 (Convert to 3NF)
Old Scheme {Title, PubID, BookType, Price }
New Scheme {BookType, Price}
New Scheme {Title, PubID, BookType}
Normalization:3NF( Decomposition)
Example 2 (Convert to 3NF)
Old Scheme {StudioID, StudioCity, CityTemp}
New Scheme {StudioID, StudioCity}
New Scheme {StudioCity, CityTemp}
Example 3 (Convert to 3NF)
Old Scheme {BuildingID, Contractor, Fee}
New Scheme {BuildingID, Contractor}
New Scheme {Contractor, Fee}
BuildingID  Contractor
100         Randolph
150         Ingersoll
200         Randolph
Contractor  Fee
Randolph    1200
Ingersoll   1100
Pitkin      1100
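The two tables of Example 3 can be joined on Contractor to recover the old scheme. The short Python sketch below does this with the rows shown above; it assumes that Contractor -> Fee holds, which the decomposition implies but the slide does not state.

```python
# Rows of the two new tables from Example 3.
building_contractor = [(100, "Randolph"), (150, "Ingersoll"), (200, "Randolph")]
contractor_fee = {"Randolph": 1200, "Ingersoll": 1100, "Pitkin": 1100}

# Natural join on Contractor rebuilds {BuildingID, Contractor, Fee}.
joined = [
    (building, contractor, contractor_fee[contractor])
    for building, contractor in building_contractor
]

for row in joined:
    print(row)
```

Each fee is stored once, so changing Randolph's fee touches a single row. Pitkin can also be recorded with a fee before being assigned to any building, which the old scheme could not do without a placeholder BuildingID.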
Normalization: Boyce-Codd Normal Form (BCNF)
BCNF does not allow dependencies between attributes that belong to candidate
keys.
BCNF is a refinement of third normal form: it drops the 3NF exception for
attributes that belong to a candidate key, so every determinant must be a candidate key.
Third normal form and BCNF can differ only if all of the following conditions are true:
The table has two or more candidate keys
At least two of the candidate keys are composed of more than one attribute
The keys are not disjoint, i.e., the composite candidate keys share some attributes
Example 1 - Address (Not in BCNF)
Scheme {City, Street, ZipCode}
1. Key1 {City, Street }
2. Key2 {ZipCode, Street}
3. No non-key attribute hence 3NF
4. {City, Street} -> {ZipCode}
5. {ZipCode} -> {City}
6. Dependency between attributes belonging to a key
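The BCNF condition can be tested in Python by checking that the left-hand side of every nontrivial FD is a superkey. This is a minimal sketch applied to the Address example; `bcnf_violations` is an illustrative name.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(schema)
    ]


schema = {"City", "Street", "ZipCode"}
fds = [({"City", "Street"}, {"ZipCode"}), ({"ZipCode"}, {"City"})]

violations = bcnf_violations(schema, fds)
print(violations)  # ZipCode -> City: ZipCode alone is not a superkey
```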
Normalization: Boyce-Codd Normal Form (BCNF)
Example 2 - Movie (Not in BCNF)
Scheme {MovieTitle, StudioID, MovieID, ActorName, Role, Payment }
1. Key1 {MovieTitle, StudioID, ActorName}
2. Key2 {MovieID, ActorName}
3. Both role and payment functionally depend on both candidate keys thus 3NF
4. {MovieID} -> {MovieTitle}
5. Dependency between MovieID & MovieTitle Violates BCNF
Normalization: BCNF(Decomposition)
1. Place the two candidate primary keys in separate entities
2. Place each of the remaining data items in one of the resulting
entities according to its dependency on the primary key.
Example 1 (Convert to BCNF)
Old Scheme {City, Street, ZipCode }
New Scheme1 {ZipCode, Street}
New Scheme2 {City, Street}
Loss of relation {ZipCode} -> {City}
Alternate New Scheme1 {ZipCode, Street }
Alternate New Scheme2 {ZipCode, City}
Normalization: Decomposition (Loss of Information)
1. If decomposition does not cause any loss of information, it is
called a lossless decomposition.
2. If a decomposition does not cause any dependencies to be lost it
is called a dependency-preserving decomposition.
3. Any table scheme can be decomposed in a lossless way into a
collection of smaller schemas that are in BCNF form. However,
the dependency preservation is not guaranteed.
4. Any table can be decomposed in a lossless way into 3rd normal
form that also preserves the dependencies.
• 3NF may be better than BCNF in some cases
Use your own judgment when decomposing schemas
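For a decomposition into two schemes where only functional dependencies are given, losslessness can be tested with closure: the shared attributes must determine all of one scheme. The Python sketch below applies this test to the two Address decompositions from the BCNF example. The function name is illustrative, and the test does not cover decompositions into more than two schemes.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Lossless iff the common attributes determine all of r1 or all of r2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


fds = [({"City", "Street"}, {"ZipCode"}), ({"ZipCode"}, {"City"})]

# First decomposition: {ZipCode, Street} and {City, Street}.
print(is_lossless_binary({"ZipCode", "Street"}, {"City", "Street"}, fds))   # False
# Alternate decomposition: {ZipCode, Street} and {ZipCode, City}.
print(is_lossless_binary({"ZipCode", "Street"}, {"ZipCode", "City"}, fds))  # True
```

The alternate decomposition is lossless, but {City, Street} -> {ZipCode} no longer fits inside a single scheme, which illustrates why dependency preservation is not guaranteed for BCNF.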
Normalization: BCNF(Decomposition)
Example 2 (Convert to BCNF)
Old Scheme {MovieTitle, StudioID, MovieID, ActorName, Role, Payment }
New Scheme {MovieID, ActorName, Role, Payment}
New Scheme {MovieTitle, StudioID, ActorName}
Loss of relation {MovieID} -> {MovieTitle}
New Scheme {MovieID, ActorName, Role, Payment}
New Scheme {MovieID, MovieTitle}
We got the {MovieID} -> {MovieTitle} relationship back
Example 3 (Convert to BCNF)
Old Scheme {Client, Problem, Consultant}
New Scheme {Client, Consultant}
New Scheme {Client, Problem}
Loss of relation {Consultant, Problem}
New Scheme {Client, Consultant}
New Scheme {Consultant, Problem}
Normalization: Fourth Normal Form (4NF)
Fourth normal form eliminates independent many-to-one relationships
between columns.
To be in Fourth Normal Form,
a relation must first be in Boyce-Codd Normal Form.
a given relation may not contain more than one multi-valued attribute.
Example (Not in 4NF)
Scheme {MovieName, ScreeningCity, Genre}
Primary Key: {MovieName, ScreeningCity, Genre}
1. All columns are a part of the only candidate key, hence BCNF
2. Many Movies can have the same Genre
3. Many Cities can have the same movie
4. Violates 4NF
Movie        ScreeningCity  Genre
Hard Code    Los Angeles    Comedy
Hard Code    New York       Comedy
Bill Durham  Santa Cruz     Drama
Normalization: Fourth Normal Form (4NF)
Example 2 (Not in 4NF)
Scheme {Manager, Child, Employee}
1. Primary Key {Manager, Child, Employee}
2. Each manager can have more than one child
3. Each manager can supervise more than one employee
4. 4NF Violated
Manager  Child  Employee
Jim      Beth   Alice
Mary     Bob    Jane
Mary     NULL   Adam
Normalization:4NF(Decomposition)
1. Move the two multi-valued relations to separate tables
2. Identify a primary key for each new entity.
Normalization: 4NF(Decomposition)
Example 2 (Convert to 4NF)
Old Scheme {Manager, Child, Employee}
Manager  Child
Jim      Beth
Manager  Employee
Jim      Alice
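The split into two tables can be sketched in Python as two projections of the old scheme. The input rows are those of the Manager, Child, Employee table from the "Not in 4NF" example, with NULL represented as None. The variable names are illustrative.

```python
# Rows of the old scheme {Manager, Child, Employee}; None stands for NULL.
rows = [
    ("Jim", "Beth", "Alice"),
    ("Mary", "Bob", "Jane"),
    ("Mary", None, "Adam"),
]

# Project each independent multivalued fact into its own table,
# dropping rows whose only purpose was to pad the other column.
manager_child = sorted({(m, c) for m, c, _ in rows if c is not None})
manager_employee = sorted({(m, e) for m, _, e in rows if e is not None})

print(manager_child)     # [('Jim', 'Beth'), ('Mary', 'Bob')]
print(manager_employee)  # [('Jim', 'Alice'), ('Mary', 'Adam'), ('Mary', 'Jane')]
```

After the split, no NULL is needed to record that Mary supervises Adam.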
Normalization:Fifth Normal Form (5NF)
Fifth normal form applies to M-Way relationships.
In 5NF all tables are broken into as many tables as possible in order to
avoid redundancy.
Once it is in fifth normal form it cannot be broken into smaller
relations without changing the facts or the meaning.
Thank You!