Chapter 15 discusses normalization for relational databases, outlining informal design guidelines, functional dependencies, and various normal forms including 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF. It emphasizes the importance of reducing redundancy, avoiding update anomalies, and ensuring clear semantics in relation schemas. The chapter also provides definitions and examples to illustrate the concepts of keys and normalization processes.
7 Normalization For Relational Databases
Chapter 15 Outline • Informal Design Guidelines for Relation Schemas • Functional Dependencies • Normal Forms Based on Primary Keys • General Definitions of Second and Third Normal Forms • Boyce-Codd Normal Form
Introduction • Levels at which we can discuss goodness of relation schemas: logical (or conceptual) level; implementation (or physical storage) level • Approaches to database design: bottom-up or top-down
Informal Design Guidelines for Relation Schemas • Measures of quality: making sure attribute semantics are clear; reducing redundant information in tuples; reducing NULL values in tuples; disallowing the possibility of generating spurious tuples
Imparting Clear Semantics to Attributes in Relations • Semantics of a relation: the meaning resulting from the interpretation of attribute values in a tuple • The easier it is to explain the semantics of a relation, the better the schema design
Guideline 1 • Design a relation schema so that it is easy to explain its meaning • Do not combine attributes from multiple entity types and relationship types into a single relation • Example of violating Guideline 1: Figure 15.3
Redundant Information in Tuples and Update Anomalies • Grouping attributes into relation schemas has a significant effect on storage space • Storing natural joins of base relations leads to update anomalies • Types of update anomalies: insertion, deletion, modification
EXAMPLE OF AN UPDATE ANOMALY Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) Update Anomaly • Changing the name of project number P1 from “Billing” to “Customer-Accounting” requires this update to be made in the tuples of all 100 employees working on project P1; missing any of them leaves the relation inconsistent Insert Anomaly • Cannot insert a project unless an employee is assigned to it • Conversely, cannot insert an employee unless he/she is assigned to a project
EXAMPLE OF AN UPDATE ANOMALY (2) Delete Anomaly • When a project is deleted, all the employees who work on that project are deleted with it • Alternatively, if an employee is the sole employee on a project, deleting that employee also deletes the corresponding project • Design a schema that does not suffer from the insertion, deletion, and update anomalies; if any are present, note them so that applications can take them into account
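The modification and deletion anomalies described for EMP_PROJ can be reproduced on a handful of rows. The following is only a minimal sketch: the sample rows and the helper names (rename_project_partially, project_names, delete_employee) are hypothetical and not part of the slides.

```python
# Hypothetical sample rows for EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours).
emp_proj = [
    ("E1", "P1", "Smith", "Billing", 10),
    ("E2", "P1", "Wong", "Billing", 20),
    ("E3", "P2", "Zelaya", "Payroll", 15),
]

def rename_project_partially(rows, proj, new_name, limit):
    """Simulate an update that reaches only the first `limit` matching rows."""
    out, changed = [], 0
    for emp, p, ename, pname, hours in rows:
        if p == proj and changed < limit:
            pname, changed = new_name, changed + 1
        out.append((emp, p, ename, pname, hours))
    return out

def project_names(rows, proj):
    """Collect every name stored for one project number."""
    return {pname for _, p, _, pname, _ in rows if p == proj}

def delete_employee(rows, emp):
    """Remove all rows of one employee."""
    return [r for r in rows if r[0] != emp]

# Modification anomaly: the same project now carries two different names.
inconsistent = rename_project_partially(emp_proj, "P1", "Customer-Accounting", limit=1)
names_for_p1 = project_names(inconsistent, "P1")

# Deletion anomaly: removing the sole employee of P2 also loses project P2.
after_delete = delete_employee(emp_proj, "E3")
remaining_projects = {p for _, p, _, _, _ in after_delete}
```

Storing Pname once, in a separate project relation keyed by Proj#, would make both effects impossible by construction.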
Guideline 2 • Design base relation schemas so that no update anomalies are present in the relations • If any anomalies are present: note them clearly, and make sure that the programs that update the database will operate correctly
NULL Values in Tuples • May group many attributes together into a “fat” relation • Can end up with many NULLs • Problems with NULLs: wasted storage space; problems understanding meaning
Guideline 3 • Avoid placing attributes in a base relation whose values may frequently be NULL • If NULLs are unavoidable, make sure that they apply in exceptional cases only, not to a majority of tuples
Generation of Spurious Tuples • Figure 15.5(a): relation schemas EMP_LOCS and EMP_PROJ1 • The NATURAL JOIN result produces many more tuples than the original set of tuples in EMP_PROJ • The extra tuples are called spurious tuples; they represent spurious information that is not valid
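A small sketch can show how such a join manufactures tuples that were never stored. The slides only name the schemas, so the attribute lists used here, EMP_LOCS(Ename, Plocation) and EMP_PROJ1(Ssn, Pnumber, Hours, Pname, Plocation), and the two sample rows are assumptions made for illustration.

```python
# Hypothetical EMP_PROJ rows: (Ssn, Pnumber, Hours, Ename, Pname, Plocation).
emp_proj = {
    ("111", 1, 10, "Smith", "ProductX", "Houston"),
    ("222", 2, 20, "Wong", "ProductY", "Houston"),
}

# Project EMP_PROJ onto the two assumed schemas.
emp_locs = {(ename, ploc) for _, _, _, ename, _, ploc in emp_proj}
emp_proj1 = {
    (ssn, pnum, hrs, pname, ploc)
    for ssn, pnum, hrs, _, pname, ploc in emp_proj
}

# Natural join on the only shared attribute, Plocation.
joined = {
    (ssn, pnum, hrs, ename, pname, ploc)
    for (ssn, pnum, hrs, pname, ploc) in emp_proj1
    for (ename, ploc2) in emp_locs
    if ploc == ploc2
}

# Tuples in the join result that were never in the original relation.
spurious = joined - emp_proj
```

Because Plocation is neither a key of EMP_LOCS nor of EMP_PROJ1, the join pairs every employee with every project at the same location, which is where the extra tuples come from.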
Summary and Discussion of Design Guidelines • Anomalies cause redundant work to be done • Waste of storage space due to NULLs • Difficulty of performing operations and joins due to NULL values • Generation of invalid and spurious data during joins
Definitions of Keys and Attributes Participating in Keys • Definition of superkey and key • Candidate key: if there is more than one key in a relation schema, one is the primary key and the others are secondary keys
Introduction to Normalization • Normalization: process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations • Normal form: condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form • 2NF, 3NF, and BCNF are based on keys and FDs of a relation schema • 4NF is based on keys and multi-valued dependencies
First Normal Form • Disallows composite attributes, multivalued attributes, and nested relations, that is, attributes whose values for an individual tuple are non-atomic • Considered to be part of the definition of a relation
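One common way to remove a multivalued attribute is to move it into its own relation alongside the key. The sketch below assumes a hypothetical DEPARTMENT record whose Dlocations field holds a list; the data and the function name to_1nf are illustrative only.

```python
# Hypothetical non-1NF data: Dlocations holds a list, so it is not atomic.
departments = [
    {"Dnumber": 5, "Dname": "Research",
     "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dnumber": 4, "Dname": "Administration", "Dlocations": ["Stafford"]},
]

def to_1nf(rows):
    """Split the multivalued attribute into a separate relation keyed by Dnumber."""
    dept = [(r["Dnumber"], r["Dname"]) for r in rows]
    dept_locations = [(r["Dnumber"], loc) for r in rows for loc in r["Dlocations"]]
    return dept, dept_locations

dept, dept_locations = to_1nf(departments)
```

Every value in dept and dept_locations is now a single atomic value, and the pair (Dnumber, Dlocation) serves as the key of the new relation.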
Second Normal Form Examples • {SSN, PNUMBER} → HOURS is a full FD since neither SSN → HOURS nor PNUMBER → HOURS holds • {SSN, PNUMBER} → ENAME is not a full FD (it is called a partial dependency) since SSN → ENAME also holds • A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key (the whole key) or, in the general definition, on every candidate key
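Whether an FD is violated can be checked against a concrete relation instance. The sketch below uses hypothetical rows and a hypothetical helper, fd_holds. Note that an instance can only refute an FD; an FD that survives the check is merely consistent with the data, since FDs are statements about the schema's semantics.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on `lhs` but differ on `rhs` in this instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical rows: one employee on two projects, two employees on one project.
rows = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 32.5, "ENAME": "Smith"},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 7.5, "ENAME": "Smith"},
    {"SSN": "222", "PNUMBER": 1, "HOURS": 20.0, "ENAME": "Wong"},
]

whole_key_hours = fd_holds(rows, ["SSN", "PNUMBER"], ["HOURS"])  # not violated
ssn_hours = fd_holds(rows, ["SSN"], ["HOURS"])                   # violated
pnumber_hours = fd_holds(rows, ["PNUMBER"], ["HOURS"])           # violated
ssn_ename = fd_holds(rows, ["SSN"], ["ENAME"])                   # not violated
```

The last check is the partial dependency: ENAME is determined by SSN alone, which is only part of the key {SSN, PNUMBER}.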
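Third Normal Form Definition • Transitive functional dependency: an FD X → Y is transitive if there is a set of attributes Z that is neither a candidate key nor a subset of any key, and both X → Z and Z → Y hold • A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute of R is transitively dependent on the primary key • Examples: SSN → DMGRSSN is a transitive FD since SSN → DNUMBER and DNUMBER → DMGRSSN hold; SSN → ENAME is non-transitive since there is no set of attributes X where SSN → X and X → ENAME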
BCNF (Boyce-Codd Normal Form) • A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever a nontrivial FD X → A holds in R, then X is a superkey of R • Each normal form is strictly stronger than the previous one: • Every 2NF relation is in 1NF • Every 3NF relation is in 2NF • Every BCNF relation is in 3NF
There exist relations that are in 3NF but not in BCNF
The goal is to have each relation in BCNF (or 3NF)
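The gap between 3NF and BCNF can be made concrete with attribute closures. The sketch below is one possible way to test both conditions, not a definitive implementation. The relation TEACH(Student, Course, Instructor) with FDs {Student, Course} → Instructor and Instructor → Course is a hypothetical example that does not appear in these slides, and all function names are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute-force search for minimal superkeys; adequate for small schemas."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) >= set(schema) and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def is_bcnf(schema, fds):
    """Every nontrivial given FD must have a superkey on its left-hand side."""
    return all(
        set(rhs) <= set(lhs) or closure(lhs, fds) >= set(schema)
        for lhs, rhs in fds
    )

def is_3nf(schema, fds):
    """As BCNF, but an FD is also allowed if its right-hand side is prime."""
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(schema):
            continue
        if not (set(rhs) - set(lhs)) <= prime:
            return False
    return True

# Hypothetical TEACH relation and FDs (an assumption, not from the slides).
teach = {"Student", "Course", "Instructor"}
teach_fds = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]

teach_keys = candidate_keys(teach, teach_fds)
teach_in_3nf = is_3nf(teach, teach_fds)
teach_in_bcnf = is_bcnf(teach, teach_fds)
```

Here Instructor → Course has a left-hand side that is not a superkey, which breaks BCNF, but Course belongs to a candidate key, so the 3NF condition still holds.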
Example • Given the relation Book(Book_title, Authorname, Book_type, Listprice, Author_affil, Publisher) • The FDs are: Book_title → Publisher, Book_type; Book_type → Listprice; Authorname → Author_affil • What normal form is the relation in?
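One way to approach this exercise is to compute the candidate keys and then look for non-prime attributes that depend on only part of a key. The sketch below reads the FDs as Book_title → {Publisher, Book_type}, Book_type → Listprice, and Authorname → Author_affil; the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute-force search for minimal superkeys; adequate for small schemas."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) >= set(schema) and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def partial_dependencies(schema, fds):
    """List (part_of_key, non_prime_dependents) pairs that violate 2NF."""
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                dependents = closure(part, fds) - set(part) - prime
                if dependents:
                    found.append((set(part), dependents))
    return keys, found

book = {"Book_title", "Authorname", "Book_type",
        "Listprice", "Author_affil", "Publisher"}
book_fds = [
    ({"Book_title"}, {"Publisher", "Book_type"}),
    ({"Book_type"}, {"Listprice"}),
    ({"Authorname"}, {"Author_affil"}),
]

book_keys, book_partials = partial_dependencies(book, book_fds)
```

Under this reading the only candidate key is {Book_title, Authorname}, and every non-prime attribute depends on just one half of it, so the relation is in 1NF but not in 2NF.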