Normalization

Uploaded by

nishant
Normalization of DB Tables
■ Normalization
► Process for evaluating and correcting table structures
• determines the optimal assignment of attributes to entities
► Normalization provides a micro view of entities
• focuses on characteristics of specific entities
• may yield additional entities
► Works through a series of stages called normal forms
• 1NF → 2NF → 3NF → 4NF (optional)
► The higher the normal form, the slower the database response tends to be
• more joins are required to answer end-user queries

■ Why normalize?
► Reduce uncontrolled data redundancies
• Help eliminate data anomalies
► Produce controlled redundancies to link tables

Example: Need for Normalization
■ PROJ_NUM is intended to be the primary key but contains nulls
■ Table entries invite data inconsistencies
► e.g. “Elect. Engineer”, “Elect.Eng.”, “EE”
■ Table displays data redundancies that can cause data anomalies
► Update anomalies
• Modifying JOB_CLASS could require many alterations (all the rows for the same EMP_NUM)
► Insertion anomalies
• A new employee must be assigned to a project before the employee can be entered
► Deletion anomalies
• If an employee quits and the row is deleted, other vital data may be lost
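
The update anomaly described above can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module; the table name `report` and the sample rows are made up for illustration and are not taken from the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE report (
        proj_num INTEGER, emp_num INTEGER, emp_name TEXT,
        job_class TEXT, chg_hour REAL, hours REAL
    )
""")
# Hypothetical rows: employee 103 works on two projects,
# so the same JOB_CLASS value is stored twice.
conn.executemany(
    "INSERT INTO report VALUES (?, ?, ?, ?, ?, ?)",
    [
        (15, 103, "Employee A", "Elect. Engineer", 84.50, 23.8),
        (18, 103, "Employee A", "Elect. Engineer", 84.50, 12.0),
        (15, 101, "Employee B", "Database Designer", 105.00, 19.4),
    ],
)

# Update anomaly: only one of the two rows for EMP_NUM 103 is changed.
conn.execute(
    "UPDATE report SET job_class = 'EE' WHERE emp_num = 103 AND proj_num = 15"
)

# Employees that now carry more than one JOB_CLASS value are inconsistent.
inconsistent = conn.execute("""
    SELECT emp_num, COUNT(DISTINCT job_class)
    FROM report
    GROUP BY emp_num
    HAVING COUNT(DISTINCT job_class) > 1
""").fetchall()
print(inconsistent)  # [(103, 2)]
```

Because the job class is repeated on every project row, a single missed row is enough to leave the employee with two conflicting job classes.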

Database Systems: Design, Implementation, & Management: Rob & Coronel

Normalization: First Normal Form
■ First Normal Form (1NF)
► All the primary key attributes are defined
► There are no repeating groups
► All attributes are dependent on the primary key

■ Conversion to 1NF
► Objective
• Develop a proper primary key
► Steps
1. Eliminate repeating groups
▪ fill in the null cells with the appropriate data values
2. Identify the primary key
▪ identify the attribute(s) that uniquely identify each row
3. Identify all dependencies
▪ make sure all attributes are dependent on the primary key

Normalization: 1NF example
1. Eliminate repeating groups - Fill in the null cells to make each row define a single entity
2. Identify the primary key - Make sure all attributes are dependent on the primary key
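
The first two steps can be sketched in plain Python. The rows below are hypothetical and only imitate a report in which PROJ_NUM and PROJ_NAME are printed once per group; the function name `fill_repeating_groups` is likewise an illustration, not part of the source.

```python
# Hypothetical rows (PROJ_NUM, PROJ_NAME, EMP_NUM, HOURS) with repeating groups:
# the project columns are null on every row except the first of each group.
raw_rows = [
    (15, "Evergreen", 103, 23.8),
    (None, None, 101, 19.4),
    (18, "Amber Wave", 114, 24.6),
    (None, None, 118, 45.3),
]


def fill_repeating_groups(rows):
    """Copy the last seen PROJ_NUM/PROJ_NAME into rows where they are null."""
    filled = []
    last_num, last_name = None, None
    for proj_num, proj_name, emp_num, hours in rows:
        if proj_num is None:
            proj_num, proj_name = last_num, last_name
        else:
            last_num, last_name = proj_num, proj_name
        filled.append((proj_num, proj_name, emp_num, hours))
    return filled


rows_1nf = fill_repeating_groups(raw_rows)

# Step 2: check that (PROJ_NUM, EMP_NUM) identifies each row uniquely,
# which makes it a candidate composite primary key for this sample.
keys = [(row[0], row[2]) for row in rows_1nf]
is_unique = len(keys) == len(set(keys))
print(rows_1nf[1], is_unique)
```

Uniqueness in a sample only shows that the key is plausible; whether it is the right primary key still depends on the business rules.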



Normalization: 1NF example
3. Identify all dependencies (in a Dependency Table)
► Desirable dependencies (arrows above)
• based on primary key (functional dependency)
► Less desirable dependencies (arrows below)
• Partial dependency
▪ based on only part of a composite primary key
• Transitive dependency
▪ one nonprime attribute depends on another nonprime attribute
• Both are subject to data redundancies and anomalies
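
A dependency table can be checked against sample data. The sketch below assumes rows stored as dictionaries and a hypothetical helper `determines`; it can only refute a dependency, since real dependencies come from business rules rather than from the rows that happen to be present.

```python
def determines(rows, lhs, rhs):
    """Return True if equal lhs values always go with equal rhs values in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows in the 1NF layout used in these slides.
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 103, "PROJ_NAME": "Evergreen",
     "JOB_CLASS": "Elect. Engineer", "CHG_HOUR": 84.5, "HOURS": 23.8},
    {"PROJ_NUM": 18, "EMP_NUM": 103, "PROJ_NAME": "Amber Wave",
     "JOB_CLASS": "Elect. Engineer", "CHG_HOUR": 84.5, "HOURS": 12.0},
    {"PROJ_NUM": 15, "EMP_NUM": 101, "PROJ_NAME": "Evergreen",
     "JOB_CLASS": "Database Designer", "CHG_HOUR": 105.0, "HOURS": 19.4},
]

# Partial dependencies: one part of the composite key is enough.
print(determines(rows, ["PROJ_NUM"], ["PROJ_NAME"]))
print(determines(rows, ["EMP_NUM"], ["JOB_CLASS"]))
# Transitive dependency: one nonprime attribute determines another.
print(determines(rows, ["JOB_CLASS"], ["CHG_HOUR"]))
# HOURS needs the whole key: EMP_NUM alone does not determine it.
print(determines(rows, ["EMP_NUM"], ["HOURS"]))
print(determines(rows, ["PROJ_NUM", "EMP_NUM"], ["HOURS"]))
```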



Normalization: Second Normal Form
■ Second Normal Form (2NF)
► It is in 1NF
► There are no partial dependencies

■ Conversion to 2NF
► Objective
• Eliminate partial dependencies
► Steps
1. Start with the 1NF format
2. Write each key component (with a partial dependency) on a separate line
3. Write the original (composite) key on the last line
4. Each component becomes a new table
5. Write the dependent attributes after each key

1NF (PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)
↓
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
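
The decomposition into PROJECT, EMPLOYEE and ASSIGN can be tried out with sqlite3. This is a sketch under assumptions: the column types, the foreign keys and the sample rows are illustrative choices, not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE project (proj_num INTEGER PRIMARY KEY, proj_name TEXT);
    CREATE TABLE employee (
        emp_num INTEGER PRIMARY KEY, emp_name TEXT, job_class TEXT, chg_hour REAL
    );
    CREATE TABLE assign (
        proj_num INTEGER REFERENCES project(proj_num),
        emp_num INTEGER REFERENCES employee(emp_num),
        hours REAL,
        PRIMARY KEY (proj_num, emp_num)
    );
""")

# Hypothetical 1NF rows:
# (PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)
flat = [
    (15, 103, "Evergreen", "Employee A", "Elect. Engineer", 84.5, 23.8),
    (18, 103, "Amber Wave", "Employee A", "Elect. Engineer", 84.5, 12.0),
    (15, 101, "Evergreen", "Employee B", "Database Designer", 105.0, 19.4),
]

for proj_num, emp_num, proj_name, emp_name, job_class, chg_hour, hours in flat:
    # Attributes that depend on only one key component go to that component's table.
    conn.execute("INSERT OR IGNORE INTO project VALUES (?, ?)", (proj_num, proj_name))
    conn.execute(
        "INSERT OR IGNORE INTO employee VALUES (?, ?, ?, ?)",
        (emp_num, emp_name, job_class, chg_hour),
    )
    # HOURS depends on the whole composite key, so it stays with it.
    conn.execute("INSERT INTO assign VALUES (?, ?, ?)", (proj_num, emp_num, hours))

counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("project", "employee", "assign")
}
print(counts)  # {'project': 2, 'employee': 2, 'assign': 3}
```

Each project and each employee is now stored once, while ASSIGN keeps one row per project and employee pair.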
Normalization: 2NF example

Normalization: Third Normal Form
■ Third Normal Form (3NF)
► It is in 2NF
► There are no transitive dependencies

■ Conversion to 3NF
► Objective
• Eliminate transitive dependencies (TP)
► Steps
1. Start with the 2NF format
2. Break off the TP pieces and create separate tables

EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
↓
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
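
Once CHG_HOUR lives in JOB, a rate change is a single-row update. The sketch below uses sqlite3 with made-up employees and rates to illustrate that effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (job_class TEXT PRIMARY KEY, chg_hour REAL);
    CREATE TABLE employee (
        emp_num INTEGER PRIMARY KEY,
        emp_name TEXT,
        job_class TEXT REFERENCES job(job_class)
    );
    -- Hypothetical sample data
    INSERT INTO job VALUES ('Elect. Engineer', 84.50), ('Database Designer', 105.00);
    INSERT INTO employee VALUES
        (103, 'Employee A', 'Elect. Engineer'),
        (104, 'Employee B', 'Elect. Engineer'),
        (101, 'Employee C', 'Database Designer');
""")

# One update in JOB changes the rate for every employee in that job class.
conn.execute("UPDATE job SET chg_hour = 90.00 WHERE job_class = 'Elect. Engineer'")

rates = conn.execute("""
    SELECT e.emp_num, j.chg_hour
    FROM employee AS e
    JOIN job AS j ON j.job_class = e.job_class
    ORDER BY e.emp_num
""").fetchall()
print(rates)  # [(101, 105.0), (103, 90.0), (104, 90.0)]
```

The price is the join needed to read an employee's rate, which is the trade-off behind the earlier remark about higher normal forms and response time.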

Normalization: 3NF example

Normalization: Fourth Normal Form
■ Fourth Normal Form (4NF)
► It is in 3NF
► There are no multiple sets of independent multi-valued dependencies
► Infrequently needed
• e.g. COURSE has multiple texts and multiple instructors (texts for a course are not decided by the instructor)

■ Conversion to 4NF
1. Identify multiple multi-valued attributes
2. Create separate tables, each containing one of the multi-valued attributes

Original table:
COURSE  CRS_TEXT            CRS_INSTRUCTOR
S511    DB design           Jones
S511    DB design           Smith
S511    Inside Access 2007  Jones
S511    Inside Access 2007  Smith

After conversion to 4NF:
COURSE  CRS_TEXT
S511    DB design
S511    Inside Access 2007

COURSE  CRS_INSTRUCTOR
S511    Jones
S511    Smith
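
The COURSE example can be checked with Python sets: splitting the table into two projections and joining them back on COURSE returns exactly the original rows, so the split loses nothing. The variable names are illustrative.

```python
# The four rows of the COURSE example: (COURSE, CRS_TEXT, CRS_INSTRUCTOR)
rows = {
    ("S511", "DB design", "Jones"),
    ("S511", "DB design", "Smith"),
    ("S511", "Inside Access 2007", "Jones"),
    ("S511", "Inside Access 2007", "Smith"),
}

# Step 2 of the conversion: one table per multi-valued attribute.
course_text = {(course, text) for course, text, _ in rows}
course_instructor = {(course, instructor) for course, _, instructor in rows}

# Joining the two tables on COURSE rebuilds every text/instructor combination.
rejoined = {
    (course, text, instructor)
    for course, text in course_text
    for other_course, instructor in course_instructor
    if course == other_course
}

print(len(course_text), len(course_instructor), rejoined == rows)  # 2 2 True
```

Four rows in the combined table shrink to two rows in each 4NF table, and adding a third text would add one row instead of two.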

Additional Table Enhancement
■ Adhere to naming conventions
■ Use a transaction code instead of a composite primary key when appropriate
► e.g. ASG_NUM in ASSIGN
■ Use simple attributes
► e.g. EMP_LNAME, EMP_FNAME, EMP_INIT in EMPLOYEE
■ Add attributes to facilitate information extraction
► e.g. EMP_NUM in PROJECT to indicate the project manager
► e.g. ASG_CHG_HR in ASSIGN for historical accuracy of data
■ Allow controlled data redundancies
► e.g. ASG_CHG_AMOUNT in ASSIGN (derived attribute)

PROJECT (PROJ_NUM, PROJ_NAME)
JOB (JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
↓
PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HR)
ASSIGN (ASG_NUM, ASG_DATE, PROJ_NUM, EMP_NUM, ASG_HRS, ASG_CHG_HR, ASG_CHG_AMOUNT)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INIT, EMP_HIREDATE, JOB_CODE)
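
The role of ASG_CHG_HR and ASG_CHG_AMOUNT can be illustrated with sqlite3. In this sketch the helper `record_assignment`, the job code '501' and the rates are assumptions, and ASG_DATE is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (job_code TEXT PRIMARY KEY, job_description TEXT, job_chg_hr REAL);
    CREATE TABLE assign (
        asg_num INTEGER PRIMARY KEY,  -- transaction code instead of (PROJ_NUM, EMP_NUM)
        proj_num INTEGER,
        emp_num INTEGER,
        asg_hrs REAL,
        asg_chg_hr REAL,              -- rate in force when the work was done
        asg_chg_amount REAL           -- derived: asg_hrs * asg_chg_hr
    );
    INSERT INTO job VALUES ('501', 'Elect. Engineer', 80.0);
""")


def record_assignment(conn, proj_num, emp_num, job_code, hours):
    """Copy the current job rate into ASSIGN and store the derived amount."""
    rate = conn.execute(
        "SELECT job_chg_hr FROM job WHERE job_code = ?", (job_code,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO assign (proj_num, emp_num, asg_hrs, asg_chg_hr, asg_chg_amount) "
        "VALUES (?, ?, ?, ?, ?)",
        (proj_num, emp_num, hours, rate, hours * rate),
    )


record_assignment(conn, 15, 103, "501", 2.5)
# The rate rises later; the first assignment keeps the old rate.
conn.execute("UPDATE job SET job_chg_hr = 90.0 WHERE job_code = '501'")
record_assignment(conn, 15, 103, "501", 2.0)

history = conn.execute(
    "SELECT asg_num, asg_chg_hr, asg_chg_amount FROM assign ORDER BY asg_num"
).fetchall()
print(history)  # [(1, 80.0, 200.0), (2, 90.0, 180.0)]
```

The stored rate and amount are redundant with JOB at the moment of insertion, but they are controlled redundancies: they preserve what was actually charged after the job rate changes.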

Denormalization
■ Normalization is one of many database design goals.

■ However, normalized tables result in:
► additional processing
► loss of system speed

■ Normalization purity is difficult to sustain when it conflicts with:
► design efficiency
► information requirements
► processing speed

■ Denormalize by
• use of a lower normal form
• use of controlled data redundancies

ACID
■ ACID stands for: Atomicity, Consistency, Isolation, Durability
■ ACID is the standard set of properties used in computer science to judge the reliability of a transaction. In the context of databases, it applies to data transactions.

Atomicity
■ “all or nothing”
► If a transaction fails midway, none of its changes take effect.
► If a transaction is aborted, it is as if the transaction never happened.
► If a transaction is committed, the entire transaction must be completed.

■ Example
► Money transfer from one bank to another bank.
► Buying the same book in an online bookstore
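
The money transfer example can be sketched with sqlite3, whose connection object commits when used as a context manager and rolls back if an exception is raised. The `account` table, the `transfer` helper and the balances are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit; return True only if both succeed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


# The debit would make A negative, so the credit already applied to B is undone too.
failed = transfer(conn, "A", "B", 500.0)
after_failed = balances(conn)

succeeded = transfer(conn, "A", "B", 30.0)
after_success = balances(conn)
print(failed, after_failed)      # False {'A': 100.0, 'B': 50.0}
print(succeeded, after_success)  # True {'A': 70.0, 'B': 80.0}
```

The credit is deliberately issued before the debit so that the failed transfer has a partial change to undo.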

Consistency
■ Any transaction must be valid according to all predefined rules (e.g., constraints, triggers).
■ Any transaction that violates the defined rules will not be committed.

■ Example
► Applying for a loan.
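
A predefined rule for the loan example can be expressed as a CHECK constraint. The sketch below uses sqlite3; the `loan` table, the upper limit of 50000 and the helper `apply_for_loan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE loan (
        loan_id INTEGER PRIMARY KEY,
        applicant TEXT NOT NULL,
        -- Hypothetical rule: a loan must be positive and at most 50000.
        amount REAL NOT NULL CHECK (amount > 0 AND amount <= 50000)
    )
""")


def apply_for_loan(conn, applicant, amount):
    """Return True if the application satisfies every rule and is committed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO loan (applicant, amount) VALUES (?, ?)",
                (applicant, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = apply_for_loan(conn, "Applicant A", 10000)
rejected = apply_for_loan(conn, "Applicant B", 1000000)
stored = conn.execute("SELECT COUNT(*) FROM loan").fetchone()[0]
print(accepted, rejected, stored)  # True False 1
```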

Isolation
■ Determines how and when the changes made by one transaction become visible to other users and systems.
■ Can many users access the same data at the same time?
■ Will one transaction block another transaction?

■ Example
► Watching a video: can two users access the video at the same time?
► Withdrawing money: can you and a family member withdraw money from the same bank account at the same time?
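
The withdrawal example can be sketched with two sqlite3 connections to the same database file. This assumes SQLite's default rollback-journal mode, where a second connection reads only committed data; other database systems offer different isolation levels, and the connection names `you` and `family` are illustrative.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bank.db")
    you = sqlite3.connect(path)
    family = sqlite3.connect(path)

    you.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance REAL)")
    you.execute("INSERT INTO account VALUES ('joint', 100.0)")
    you.commit()

    # Your withdrawal has started but is not committed yet.
    you.execute("UPDATE account SET balance = balance - 30 WHERE id = 'joint'")

    # The other connection still sees the last committed balance.
    seen_before = family.execute("SELECT balance FROM account").fetchall()[0][0]

    you.commit()

    # After the commit, the new balance becomes visible.
    seen_after = family.execute("SELECT balance FROM account").fetchall()[0][0]

    you.close()
    family.close()

print(seen_before, seen_after)  # 100.0 70.0
```

If the second connection tried to write while the first transaction was open, SQLite would make it wait or fail with a "database is locked" error, which is one concrete form of one transaction blocking another.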
Durability
■ It guarantees that committed transactions survive permanently, even through power loss and other emergencies.
■ Transaction logs are used to enforce durability.

■ Example
► Booking a flight ticket online: even if the system crashes, a ticket whose booking was committed remains booked.
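
The booking example can be approximated with a sqlite3 database file. Closing the connection before the second commit is only a rough stand-in for a crash, and the table and ticket numbers are made up.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bookings.db")

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE booking (ticket TEXT PRIMARY KEY, passenger TEXT)")
    conn.execute("INSERT INTO booking VALUES ('T-001', 'Passenger A')")
    conn.commit()  # this booking is now durable

    conn.execute("INSERT INTO booking VALUES ('T-002', 'Passenger B')")
    conn.close()   # the session ends before the second booking is committed

    # A new session reading the same file sees only the committed booking.
    reopened = sqlite3.connect(path)
    tickets = [
        row[0] for row in reopened.execute("SELECT ticket FROM booking ORDER BY ticket")
    ]
    reopened.close()

print(tickets)  # ['T-001']
```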
