
20-Normalization - BCNF-02-09-2024

sbdbsd

Uploaded by

Hemesh R
Copyright
© © All Rights Reserved

Database Normalization

Definition
• Normalization is the process of minimizing redundancy in a
relation or set of relations.
• It involves restructuring tables so that they successively
satisfy higher normal forms.
• A properly normalized database should have the
following characteristics:
– Scalar values in each field.
– Absence of redundancy.
– Minimal use of null values.
– Minimal loss of information.
Redundancy and Data Anomalies
Redundant data is where we have stored the same ‘information’ more than once.
i.e., the redundant data could be removed without the loss of information.

Example: We have the following relation that contains staff and department details:
Emp_Id  Job       Dept  Dname       City
SL10    Salesman  10    Sales       Chennai
SA51    Manager   20    Accounts    Bangalore
DS40    Clerk     20    Accounts    Hyderabad
OS45    Clerk     30    Operations  Vellore

Insert Anomaly: We can’t insert a dept without inserting a member of staff that
works in that department
Update Anomaly: We could change the name of the dept that SA51 works in
without simultaneously changing the dept that DS40 works in.
Deletion Anomaly: By removing employee SL10 we have removed all information
pertaining to the Sales dept.
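
The update and deletion anomalies above can be reproduced on the same four rows. This is only an illustrative sketch: the tuple layout, the helper `dept_names`, and the replacement name "Finance" are assumptions made for the demonstration, not part of the original example.

```python
# Rows mirror the staff/department table: (emp_id, job, dept, dname, city).
staff = [
    ("SL10", "Salesman", 10, "Sales", "Chennai"),
    ("SA51", "Manager", 20, "Accounts", "Bangalore"),
    ("DS40", "Clerk", 20, "Accounts", "Hyderabad"),
    ("OS45", "Clerk", 30, "Operations", "Vellore"),
]


def dept_names(rows):
    """Map each dept number to the set of department names stored for it."""
    names = {}
    for _emp, _job, dept, dname, _city in rows:
        names.setdefault(dept, set()).add(dname)
    return names


# Update anomaly: rename dept 20 on SA51's row only ("Finance" is hypothetical).
updated = [
    (e, j, d, "Finance" if e == "SA51" else n, c) for e, j, d, n, c in staff
]
inconsistent = {d: n for d, n in dept_names(updated).items() if len(n) > 1}

# Deletion anomaly: removing SL10 also removes the only record of dept 10.
after_delete = [row for row in staff if row[0] != "SL10"]
lost_depts = set(dept_names(staff)) - set(dept_names(after_delete))

print(inconsistent)  # dept 20 now carries two different names
print(lost_depts)    # dept 10 has disappeared along with SL10
```

Because the department name is stored once per employee, nothing in the table itself prevents the two copies for dept 20 from drifting apart.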
Repeating Groups
• A repeating group is an attribute (or set of attributes) that can have
more than one value for a primary key value.
• Example: We have the following relation that contains staff and
department details and a list of telephone contact numbers for each
member of staff.
Emp_id  job       dept  dname       city       contact number
SL10    Salesman  10    Sales       Chennai    018111777, 018111888, 079311122
SA51    Manager   20    Accounts    Bangalore  017111777
DS40    Clerk     20    Accounts    Hyderabad
OS45    Clerk     30    Operations  Vellore    079311555

• Repeating groups are not allowed in a relational design, since all
attributes have to be ‘atomic’, i.e., there can only be one value per cell
in a table!
Functional Dependency
• Formal Definition: Attribute B is functionally dependent upon
attribute A (or a collection of attributes) if a value of A determines a
single value of attribute B at any one time.
• Formal Notation: A → B. This should be read as ‘A determines B’
or ‘B is functionally dependent on A’.
• Here A is called the determinant and B is called the object of the
determinant.

Example: Functional Dependencies

Emp_id  job       dept  dname
SL10    Salesman  10    Sales
SA51    Manager   20    Accounts
DS40    Clerk     20    Accounts
OS45    Clerk     30    Operations

Emp_id → job
Emp_id → dept
Emp_id → dname
dept → dname
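
One way to test a candidate dependency against sample rows is sketched below. The function name `fd_holds` and the dictionary layout are assumptions for illustration. Note that sample data can only refute a functional dependency; whether a dependency truly holds is a statement about the business rules, not about four rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs but a different rhs breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"emp_id": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"emp_id": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"emp_id": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"emp_id": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]

print(fd_holds(rows, ["emp_id"], ["job"]))   # True
print(fd_holds(rows, ["dept"], ["dname"]))   # True
print(fd_holds(rows, ["job"], ["dept"]))     # False: Clerk appears in 20 and 30
```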
Functional Dependency
• Composite Determinants: If more than one attribute is necessary to
determine another attribute in an entity, then such a determinant is
termed a composite determinant.
• Full Functional Dependency: Only of relevance with composite
determinants. This is the situation when it is necessary to use all the
attributes of the composite determinant to identify its object uniquely.

Example: Full Functional Dependencies

order#  line#  qty  price
A001    001    10   200
A002    001    20   400
A002    002    20   800
A004    001    15   300

(order#, line#) → qty
(order#, line#) → price
Functional Dependency
• Partial Functional Dependency: This is the situation that exists
if it is necessary to only use a subset of the attributes of the
composite determinant to identify its object uniquely.
Example:

student#  unit#  room   grade
9900100   A01    TH224  2
9900010   A01    TH224  14
9901011   A02    JS075  3
9900001   A01    TH224  16

Full Functional Dependency
(student#, unit#) → grade

Partial Functional Dependency
unit# → room, so (student#, unit#) → room is only a partial dependency

Repetition of data! The room is stored again for every student taking the same unit.
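
Given a composite key and the declared determinant of each attribute, the full/partial distinction reduces to a subset test. The sketch below assumes the dependencies are declared by hand (they are not inferred from data); `classify` and `declared_fds` are hypothetical names.

```python
def classify(key, determinant):
    """Label a determinant as 'full' or 'partial' relative to a composite key."""
    key, determinant = frozenset(key), frozenset(determinant)
    if determinant == key:
        return "full"
    if determinant and determinant < key:  # non-empty proper subset of the key
        return "partial"
    return "neither"


key = ("student#", "unit#")
declared_fds = {
    "grade": ("student#", "unit#"),  # grade needs both key attributes
    "room": ("unit#",),              # room is fixed by the unit alone
}

kinds = {attr: classify(key, det) for attr, det in declared_fds.items()}
print(kinds)  # {'grade': 'full', 'room': 'partial'}
```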
Transitive Dependency
• Definition: A transitive dependency exists when there is an
intermediate functional dependency.
• Formal Notation: If A → B and B → C, then it can be stated
that the following transitive dependency exists: A → B → C

Example: Transitive Dependencies

Emp_id  job       dept  dname
SL10    Salesman  10    Sales
SA51    Manager   20    Accounts
DS40    Clerk     20    Accounts
OS45    Clerk     30    Operations

Emp_id → dept
dept → dname
Emp_id → dept → dname

Repetition of data!
Levels of Normalization
• Levels of normalization are based on the amount of
redundancy in the database.
• Various levels of normalization are:
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)
– Boyce-Codd Normal Form (BCNF)
– Fourth Normal Form (4NF)
– Fifth Normal Form (5NF)
– Domain Key Normal Form (DKNF)
• Moving down the list, redundancy decreases while the number of
tables and the complexity increase.

Most databases should be in 3NF or BCNF in order to avoid
the database anomalies.
Levels of Normalization
1NF
2NF
3NF
4NF
5NF
DKNF

Each higher level is a subset of the lower level.
First Normal Form (1NF)
A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to lists of values).
Example (Not 1NF)

ISBN           Title         AuName                  AuPhone                                   PubName      PubPhone      Price
0-321-32132-1  Balloon       Sleepy, Snoopy, Grumpy  321-321-1111, 232-234-1234, 665-235-6532  Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Jones, Smith            123-333-3333, 654-223-3455                Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Joyce                   666-666-6666                              Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Roman                   444-444-4444                              Big House    123-456-7890  $25.00

AuName and AuPhone columns are not scalar.
1NF - Decomposition
1. Place all items that appear in the repeating group in a new table
2. Designate a primary key for each new table produced.
3. Duplicate in the new table the primary key of the table from
which the repeating group was extracted or vice versa.
Example (1NF)

ISBN           Title         PubName      PubPhone      Price
0-321-32132-1  Balloon       Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Big House    123-456-7890  $25.00

ISBN           AuName  AuPhone
0-321-32132-1  Sleepy  321-321-1111
0-321-32132-1  Snoopy  232-234-1234
0-321-32132-1  Grumpy  665-235-6532
0-55-123456-9  Jones   123-333-3333
0-55-123456-9  Smith   654-223-3455
0-123-45678-0  Joyce   666-666-6666
1-22-233700-0  Roman   444-444-4444
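
The three decomposition steps can be sketched as a small function that pulls the repeating group out into its own table and copies the ISBN across. This is a minimal sketch, assuming the multi-valued cells arrive as comma-separated strings with names and phones in matching order; the field names and `to_1nf` are illustrative, and only two of the four books are included to keep it short.

```python
unnormalized = [
    {
        "isbn": "0-321-32132-1",
        "title": "Balloon",
        "au_name": "Sleepy, Snoopy, Grumpy",
        "au_phone": "321-321-1111, 232-234-1234, 665-235-6532",
        "pub_name": "Small House",
    },
    {
        "isbn": "0-123-45678-0",
        "title": "Ulysses",
        "au_name": "Joyce",
        "au_phone": "666-666-6666",
        "pub_name": "Alpha Press",
    },
]

REPEATING = ("au_name", "au_phone")


def to_1nf(rows):
    """Split the repeating author group into its own table keyed by ISBN."""
    books, authors = [], []
    for row in rows:
        # The book table keeps every column except the repeating group.
        books.append({k: v for k, v in row.items() if k not in REPEATING})
        names = [s.strip() for s in row["au_name"].split(",")]
        phones = [s.strip() for s in row["au_phone"].split(",")]
        # One author row per value, carrying the book's key along.
        for name, phone in zip(names, phones):
            authors.append({"isbn": row["isbn"], "au_name": name, "au_phone": phone})
    return books, authors


books, authors = to_1nf(unnormalized)
for author in authors:
    print(author)
```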
Functional Dependencies
1. If one set of attributes in a table determines another set of
attributes in the table, then the second set of attributes is
said to be functionally dependent on the first set of
attributes.

Example 1

ISBN           Title         Price
0-321-32132-1  Balloon       $34.00
0-55-123456-9  Main Street   $22.95
0-123-45678-0  Ulysses       $34.00
1-22-233700-0  Visual Basic  $25.00

Table Scheme: {ISBN, Title, Price}
Functional Dependencies:
{ISBN} → {Title}
{ISBN} → {Price}
Functional Dependencies
Example 2
PubID  PubName      PubPhone
1      Big House    999-999-9999
2      Small House  123-456-7890
3      Alpha Press  111-111-1111

Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies:
{PubID} → {PubPhone}
{PubID} → {PubName}
{PubName, PubPhone} → {PubID}
Example 3
AuID  AuName  AuPhone
1     Sleepy  321-321-1111
2     Snoopy  232-234-1234
3     Grumpy  665-235-6532
4     Jones   123-333-3333
5     Smith   654-223-3455
6     Joyce   666-666-6666
7     Roman   444-444-4444

Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies:
{AuID} → {AuPhone}
{AuID} → {AuName}
{AuName, AuPhone} → {AuID}
FD – Example
Consider a database to track reviews of papers submitted to an academic
conference. Prospective authors submit papers for review and possible
acceptance in the published conference proceedings. Details of the
entities:
– Author information includes a unique author number, a name, a mailing
address, and a unique (optional) email address.
– Paper information includes the primary author, the paper number, the
title, the abstract, and review status (pending, accepted, rejected)
– Reviewer information includes the reviewer number, the name, the
mailing address, and a unique (optional) email address
– A completed review includes the reviewer number, the date, the paper
number, comments to the authors, comments to the program chairperson,
and ratings (overall, originality, correctness, style, clarity)
FD – Example
Functional Dependencies
– AuthNo → {AuthName, AuthEmail, AuthAddress}
– AuthEmail → {AuthNo}
– PaperNo → {Primary-AuthNo, Title, Abstract, Status}
– RevNo → {RevName, RevEmail, RevAddress}
– RevEmail → {RevNo}
– {RevNo, PaperNo} → {AuthComm, Prog-Comm, Date,
Rating1, Rating2, Rating3, Rating4, Rating5}
Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements:
– The table is in first normal form
– All non-key attributes in the table must be fully functionally dependent on the
entire primary key (no partial dependencies)
Note: Remember that we are dealing with non-key attributes

Example 1 (Not 2NF)


Scheme → {Title, PubId, AuId, Price, AuAddress}
1. Key → {Title, PubId, AuId}
2. {Title, PubId, AuId} → {Price}
3. {AuId} → {AuAddress}
4. AuAddress does not belong to a key
5. AuAddress functionally depends on AuId which is a proper subset of a key
Second Normal Form (2NF)
Example 2 (Not 2NF)
Scheme → {City, Street, HouseNumber, HouseColor, CityPopulation}
1. Key → {City, Street, HouseNumber}
2. {City, Street, HouseNumber} → {HouseColor}
3. {City} → {CityPopulation}
4. CityPopulation does not belong to any key.
5. CityPopulation is functionally dependent on the City which is a proper
subset of the key

Example 3 (Not 2NF)


Scheme → {studio, movie, budget, studio_city}
1. Key → {studio, movie}
2. {studio, movie} → {budget}
3. {studio} → {studio_city}
4. studio_city is not a part of a key
5. studio_city functionally depends on studio which is a proper subset of
the key
2NF - Decomposition
1. If a data item is functionally dependent on only a part of the
primary key, move that data item and that part of the primary key to a
new table.
2. If other data items are functionally dependent on the same part of the
key, place them in the new table also.
3. Make the partial primary key copied from the original table the
primary key for the new table.
Example 1 (Convert to 2NF)
Old Scheme → {Title, PubId, AuId, Price, AuAddress}
New Scheme → {Title, PubId, AuId, Price}
New Scheme → {AuId, AuAddress}
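
The conversion above amounts to two projections of the old table. The sketch below shows the effect on a few rows; the sample rows (titles, prices, addresses) are invented purely for illustration, and `project` is a hypothetical helper, not a library call.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples but keeping order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Hypothetical sample data for the old scheme; AuAddress repeats per author.
old = [
    {"Title": "Balloon", "PubId": 1, "AuId": 1, "Price": 34.00, "AuAddress": "12 Elm St"},
    {"Title": "Main Street", "PubId": 1, "AuId": 1, "Price": 22.95, "AuAddress": "12 Elm St"},
    {"Title": "Balloon", "PubId": 1, "AuId": 2, "Price": 34.00, "AuAddress": "9 Oak Ave"},
]

book_author = project(old, ["Title", "PubId", "AuId", "Price"])
author = project(old, ["AuId", "AuAddress"])

print(len(book_author))  # 3: one row per (Title, PubId, AuId)
print(author)            # each address is now stored once per author
```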
Third Normal Form (3NF)
This form requires that every non-key attribute of a table be directly
(non-transitively) dependent on a candidate key, i.e. there can be no
dependencies among non-key attributes.

For a table to be in 3NF, there are two requirements:

– The table should be in second normal form
– No non-key attribute is transitively dependent on the primary key

Example (Not in 3NF)


Scheme → {Title, PubID, PageCount, Price}
1. Key → {Title, PubID}
2. {Title, PubID} → {PageCount}
3. {PageCount} → {Price}
4. Both Price and PageCount depend on the entire key, hence 2NF
5. Price depends on {Title, PubID} only transitively, via PageCount, hence not in 3NF
Third Normal Form (3NF)
Example 2 (Not in 3NF)
Scheme → {Studio, StudioCity, CityTemp}
1. Primary Key → {Studio}
2. {Studio} → {StudioCity}
3. {StudioCity} → {CityTemp}
4. {Studio} → {CityTemp}
5. Both StudioCity and CityTemp depend on the entire key hence 2NF
6. CityTemp transitively depends on Studio hence violates 3NF
Example 3 (Not in 3NF)

BuildingID  Contractor  Fee
100         Randolph    1200
150         Ingersoll   1100
200         Randolph    1200
250         Pitkin      1100
300         Randolph    1200

Scheme → {BuildingID, Contractor, Fee}
1. Primary Key → {BuildingID}
2. {BuildingID} → {Contractor}
3. {Contractor} → {Fee}
4. {BuildingID} → {Fee}
5. Fee transitively depends on the BuildingID
6. Both Contractor and Fee depend on the entire key hence 2NF
3NF - Decomposition
1. Move all items involved in transitive dependencies to a new entity.
2. Identify a primary key for the new entity.
3. Place the primary key for the new entity as a foreign key on the
original entity.
Example 1 (Convert to 3NF)
Old Scheme → {Title, PubID, PageCount, Price}
New Scheme → {PageCount, Price}
New Scheme → {Title, PubID, PageCount}
3NF - Decomposition
Example 2 (Convert to 3NF)
Old Scheme → {Studio, StudioCity, CityTemp}
New Scheme → {Studio, StudioCity}
New Scheme → {StudioCity, CityTemp}

Example 3 (Convert to 3NF)

Old Scheme → {BuildingID, Contractor, Fee}
New Scheme → {BuildingID, Contractor}
New Scheme → {Contractor, Fee}

BuildingID  Contractor
100         Randolph
150         Ingersoll
200         Randolph
250         Pitkin
300         Randolph

Contractor  Fee
Randolph    1200
Ingersoll   1100
Pitkin      1100
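
The building example can be checked mechanically: split the table on the transitive dependency, then join the pieces back and compare with the original. This is a minimal sketch using plain tuples; the variable names are illustrative.

```python
# (building_id, contractor, fee) rows from the building example.
buildings = [
    (100, "Randolph", 1200),
    (150, "Ingersoll", 1100),
    (200, "Randolph", 1200),
    (250, "Pitkin", 1100),
    (300, "Randolph", 1200),
]

# New scheme 1: {BuildingID, Contractor}; new scheme 2: {Contractor, Fee}.
assignments = sorted({(b, c) for b, c, _fee in buildings})
fees = sorted({(c, fee) for _b, c, fee in buildings})

# Contractor is the key of the fee table, so a dict lookup acts as the join.
fee_of = dict(fees)
rejoined = sorted((b, c, fee_of[c]) for b, c in assignments)

print(fees)                           # each fee is now stored once per contractor
print(rejoined == sorted(buildings))  # True: no information was lost
```

Randolph's fee appeared three times in the old table and once in the new one, which is the redundancy the decomposition removes.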
Boyce-Codd Normal Form (BCNF)
BCNF is free from the redundancy that functional dependencies can cause.
• BCNF does not allow dependencies between attributes that belong to
candidate keys.
• BCNF is a refinement of the third normal form: it drops the exception
that 3NF makes for dependencies whose right-hand side is part of a candidate key.
• Every binary relation (a relation with only 2 attributes) is always in BCNF.
• A table complies with BCNF if it is in 3NF and, for every non-trivial functional
dependency X → Y, X is a superkey of the table.

Example 1 - Address (Not in BCNF)


Scheme → {City, Street, ZipCode}
1. Key1 → {City, Street}
2. Key2 → {ZipCode, Street}
3. No non-key attribute hence 3NF
4. {City, Street} → {ZipCode}
5. {ZipCode} → {City}
6. Dependency between attributes belonging to a key
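
The superkey condition can be tested with the standard attribute-closure computation: X is a superkey exactly when the closure of X under the declared dependencies covers the whole scheme. The sketch below applies it to the address example; `closure` and `bcnf_violations` are hypothetical helper names, and the dependencies are the two declared above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """List the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not set(schema) <= closure(lhs, fds)
    ]


schema = {"City", "Street", "ZipCode"}
fds = [
    (("City", "Street"), ("ZipCode",)),
    (("ZipCode",), ("City",)),
]

print(closure(("ZipCode",), fds) == {"ZipCode", "City"})  # True: Street is missing
print(bcnf_violations(schema, fds))  # [(('ZipCode',), ('City',))]
```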
BCNF - Decomposition
1. Place the two candidate primary keys in separate entities
2. Place each of the remaining data items in one of the
resulting entities according to its dependency on the
primary key.
Example 1 (Convert to BCNF)
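Old Scheme → {City, Street, ZipCode}
New Scheme1 → {ZipCode, Street}
New Scheme2 → {City, Street}
Loss of the dependency {ZipCode} → {City}
Alternate New Scheme1 → {ZipCode, Street}
Alternate New Scheme2 → {ZipCode, City}
The alternate keeps {ZipCode} → {City}, but {City, Street} → {ZipCode} can no
longer be enforced within a single table.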
Decomposition – Loss of Information
1. If decomposition does not cause any loss of information it is called a
lossless decomposition.
2. If a decomposition does not cause any dependencies to be lost it is
called a dependency-preserving decomposition.
3. Any table schema can be decomposed in a lossless way into a
collection of smaller schemas that are in BCNF form. However the
dependency preservation is not guaranteed.
4. Any table can be decomposed in a lossless way into 3rd normal form
that also preserves the dependencies.
• 3NF may be better than BCNF in some cases

Use your own judgment when decomposing schemas.
Fourth Normal Form (4NF)
Fourth normal form eliminates independent multi-valued (many-to-many) relationships
between columns.
• To be in Fourth Normal Form,
– a relation must first be in Boyce-Codd Normal Form.
– a given relation may not contain more than one independent multi-valued attribute.

Example (Not in 4NF)

Scheme → {MovieName, ScreeningCity, Genre}
Primary Key: {MovieName, ScreeningCity, Genre}
1. All columns are a part of the only candidate key, hence BCNF
2. Many movies can have the same genre
3. Many cities can have the same movie
4. Violates 4NF

Movie             ScreeningCity  Genre
Hard Code         Los Angeles    Comedy
Hard Code         New York       Comedy
Bill Durham       Santa Cruz     Drama
Bill Durham       Durham         Drama
The Code Warrier  New York       Horror
Fourth Normal Form (4NF)
Example 2 (Not in 4NF)

Scheme → {Manager, Child, Employee}
1. Primary Key → {Manager, Child, Employee}
2. Each manager can have more than one child
3. Each manager can supervise more than one employee
4. 4NF Violated

Manager  Child  Employee
Jim      Beth   Alice
Mary     Bob    Jane
Mary     NULL   Adam

Example 3 (Not in 4NF)

Scheme → {Employee, Skill, ForeignLanguage}
1. Primary Key → {Employee, Skill, ForeignLanguage}
2. Each employee can speak multiple languages
3. Each employee can have multiple skills
4. Thus violates 4NF

Employee  Skill      Language
1234      Cooking    French
1234      Cooking    German
1453      Carpentry  Spanish
1453      Cooking    Spanish
2345      Cooking    Spanish
4NF - Decomposition
1. Move the two multi-valued relations to separate tables.
2. Identify a primary key for each of the new entities.

Example 1 (Convert to 4NF)

Old Scheme → {MovieName, ScreeningCity, Genre}
New Scheme → {MovieName, ScreeningCity}
New Scheme → {MovieName, Genre}

Movie             Genre
Hard Code         Comedy
Bill Durham       Drama
The Code Warrier  Horror

Movie             ScreeningCity
Hard Code         Los Angeles
Hard Code         New York
Bill Durham       Santa Cruz
Bill Durham       Durham
The Code Warrier  New York
4NF - Decomposition
Example 2 (Convert to 4NF)

Old Scheme → {Manager, Child, Employee}
New Scheme → {Manager, Child}
New Scheme → {Manager, Employee}

Manager  Child
Jim      Beth
Mary     Bob

Manager  Employee
Jim      Alice
Mary     Jane
Mary     Adam

Example 3 (Convert to 4NF)

Old Scheme → {Employee, Skill, ForeignLanguage}
New Scheme → {Employee, Skill}
New Scheme → {Employee, ForeignLanguage}

Employee  Skill
1234      Cooking
1453      Carpentry
1453      Cooking
2345      Cooking

Employee  Language
1234      French
1234      German
1453      Spanish
2345      Spanish
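
The employee example can be verified the same way as the earlier decompositions: project onto the two new schemes and join on Employee. This is a minimal sketch with the five rows from the example; the variable names are illustrative.

```python
# (employee, skill, language) rows from Example 3.
original = {
    (1234, "Cooking", "French"),
    (1234, "Cooking", "German"),
    (1453, "Carpentry", "Spanish"),
    (1453, "Cooking", "Spanish"),
    (2345, "Cooking", "Spanish"),
}

# Skills and languages are independent facts about an employee,
# so each gets its own table.
emp_skill = {(e, s) for e, s, _lang in original}
emp_language = {(e, lang) for e, _s, lang in original}

# Joining on employee pairs every skill with every language of that employee.
rejoined = {
    (e, s, lang) for e, s in emp_skill for e2, lang in emp_language if e == e2
}

print(len(emp_skill), len(emp_language))  # 4 4
print(rejoined == original)               # True: the split loses nothing here
```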


Fifth Normal Form (5NF)
Fifth normal form is satisfied when the tables have been broken into as
many tables as possible in order to avoid redundancy, while the original
data can still be reconstructed by joining them. Once a table is
in fifth normal form it cannot be broken into smaller relations
without changing the facts or the meaning.
Domain Key Normal Form (DKNF)
• The relation is in DKNF when there can be no insertion or
deletion anomalies in the database.
