0% found this document useful (0 votes)
21 views30 pages

Boyce

The document discusses database normalization, specifically Boyce-Codd Normal Form (BCNF) and Fourth Normal Form (4NF), detailing the steps required to achieve these forms from Third Normal Form (3NF). It explains the differences between BCNF and 4NF, emphasizing that BCNF allows multi-valued dependencies while 4NF does not. Additionally, the document covers logical database design, transformation of entity-relationship diagrams into relational tables, and various types of relationships within the relational model.

Uploaded by

mohiniranjan0
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views30 pages

Boyce

The document discusses database normalization, specifically Boyce-Codd Normal Form (BCNF) and Fourth Normal Form (4NF), detailing the steps required to achieve these forms from Third Normal Form (3NF). It explains the differences between BCNF and 4NF, emphasizing that BCNF allows multi-valued dependencies while 4NF does not. Additionally, the document covers logical database design, transformation of entity-relationship diagrams into relational tables, and various types of relationships within the relational model.

Uploaded by

mohiniranjan0
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 30

Boyce-Codd Normal Form (BCNF or 3.

5NF)

BCNF stands for “Boyce-Codd Normal Form”. This normal form also known as
the 3.5 Normal form of database normalization. To achieve BCNF, the database
must be already achieved to third normal form. Then following steps should be
done to achieve the BCNF.

1. Identify all candidate keys in the relations


2. Identify all functional dependencies in the relations.
3. If there are functional dependencies in the relation, where their determinants
are not candidate keys for the relation, remove the functional dependencies by
placing them in a new relation along with a copy of their determinant.

Fourth Normal Form

Database must be in third normal form, before normalizing it to the fourth normal
form. If the database is already in third normal form, next step should be to remove
the multi-valued dependencies. (If one or more rows imply the presence of one or
more other rows in the same table, it is called multi-valued dependency.)

What is the difference between BCNF and 4NF (Fourth Normal Form)?

• Database must be already achieved to 3NF to take it to BCNF, but

database must be in 3NF and BCNF, to reach 4NF.

• In fourth normal form, there are no multi-valued dependencies of the

tables, but in BCNF, there can be multi-valued dependency data in the


tables.

6-relationalmodel.htm; updated 2/15/01

LOGICAL DATABASE DESIGN


and the RELATIONAL MODEL

Introduction.

 This set of notes focuses on logical database design while emphasizing


the Relational Model as an implementation DBMS software platform.
 We focus on the attributes of the Relational Model.
 We also focus on the transformation of an ER model diagram into a set
of relational table structures.
 Many case tools can be used to automate this transformation; however, you
need to understand the nature of the transformation.
 This chapter also covers Normalization which is a theoretical process for
determining if a table structure is going to be "well-structured" and easily
maintainable.

Review of Attributes of the Relational Data Model.

 The relational model is credited to E. F. Codd (1970) who was working at


IBM
 Basic Definitions:
o Data Structure: Data are organized into tables (often
called relations, not to be confused with relationships) with rows and
columns.
o Data Manipulation: The structured query language (SQL) is used to
manipulate data in the tables.
o Data Integrity: The DBMS has facilities to enforce business rules to
maintain data integrity, e.g., not to delete a CUSTOMER row when
there is a corresponding outstanding row in
the CUSTOMER_ORDERS table.
 Relational Data Structure:
o A table is a two-dimensional representation of data.
o Each column is named and represents a field..
o A table can have any number of rows (within physical storage
limitations).
o Each row is a record.
o We could delete all of the rows and the table would still exist.
o The shorthand notation to express the structure of a table is:

TABLE_NAME( PrimaryKeyField1, Field2, Field3, ... )

example:
EMPLOYEE( EmpID, EmpName, Department, DateHired )

Note that the "..." simply means there could be additional fields.

 Relational Keys:
o Primary Key: column (or combination of columns) that uniquely
identifies a row. Weunderline the primary key.
o Composite Key: a primary key that includes more than one column.
o Foreign Key: a field that links one table to another table. A table can
have an unlimited number of foreign keys linking to other tables. We
use a dashed underlinefor foreign keys.
 Properties of Tables (Relations):
o Each table in a database has a unique name.
o Each entry in a table at the intersection of a row/column can only
store a single value (principle of atomicity). No multivalued fields
are allowed in a table.
o Each row is unique.
o Each column has a unique field name.
o The sequence of rows as well as columns is insignificant and can be
interchanged without changing the data.

Transforming Extended Entity-Relationship (EER) Diagrams to Relational


Tables.

 Steps in Conversion:
o 1. Represent entities.
o 2. Represent relationships.
o 3. Normalize the tables.
o 4. Merge tables with the same primary key (based on the existing
database schema for the firm - the system you are currently designing
will probably add to an existing schema).

Entity Types.

 In transforming entities into tables, recall there are three types of entities:
o 1. Regular Entities: have independent existence and generally
represent real-world objects such as customers, products, departments,
etc.
o 2. Weak Entities: cannot exist except with identifying relationships
with an owner (regular) entity type.
o 3. Associate entities: formed from many-to-many relationships
between other entity types. These may be gerunds.

Transforming Entities.

 The figure shown below shows the CUSTOMER entity.


 Transform this entity to a relational table structure by creating a single table
with a column for each attribute and the primary key attribute underlined as
the primary key column for the table.
CUSTOMER(CustomerNumber, CustomerName, AccountBalance,
CustomerPhone)

Transforming Entities with Multivalued Attributes.

 One problem with the CUSTOMER table structure given above is that
theCustomerPhone column is a multivalued attribute.
 A table with sample data might look like the following:

CustomerNumber CustomerName AccountBalance CustomerPhone


555-1212
14112 Mike McClary 114.53
555-3434
15267 Anita Ritter 453.78 555-2424
555-6774
34589 Steve Young 543.30
555-3443

 Note that this table violates the principle of atomicity - there is an attempt to
store more than one CustomerPhone in a field - and this is impossible.
 The solution is to remove the multivalued attribute to a separate table along
with theprimary key (to link back to the original table). The attribute
becomes part of a composite key. The new solution consists of two tables
with sample data as shown here.

CUSTOMER(CustomerNumber, CustomerName, AccountBalance)

CustomerNumber CustomerName AccountBalance


14112 Mike McClary 114.53
15267 Anita Ritter 453.78
34589 Steve Young 543.30

CUSTOMER_PHONE(CustomerNumber, CustomerPhone)

CustomerNumber CustomerPhone
14112 555-1212
14112 555-3434
15267 555-2424
34589 555-6774
34589 555-3443

Transforming Binary 1:N Relationships for Regular Entities.

 The image shown below gives both 1:N and N:N relationships. We will
focus first on the1:N relationship named PlacesOrder.

 First, you transform each entity to a table (or set of tables if there is a
multivalued attribute(s)) as described above. Each entity becomes a single
table.
 Here we show the CUSTOMER and CUSTOMER_PHONE tables that we
created above, assuming the attributes remain unchanged. We also show
the CUSTOMER_ORDERtable assuming the attributes shown now as
columns for the table.
CUSTOMER(CustomerNumber, CustomerName, AccountBalance)

CUSTOMER_PHONE(CustomerNumber, CustomerPhone)

CUSTOMER_ORDER(OrderNumber, OrderDate, OrderAmount,


DeliveryDate, CustomerNumber)

OrderNumber OrderDate OrderAmount DeliveryDate CustomerNumber


54321 2/8/XX 550.96 2/15/XX 34589
76389 2/9/XX 110.43 2/16/XX 15267
76390 2/9/XX 675.75 2/14/XX 34589

 The relationship is implemented implicitly by copying the primary key from


the one-sideof the relationship (CustomerNumber) into the table on
the many side (CUSTOMER_ORDER). Note that here we underline with
a dashed-underline to denote the foreign key or linking column.
 Question: What if the primary key of the one-side is a composite key?
 Answer: Then all columns that comprise the composite key are copied to
the many side to become the foreign key.

Transforming Binary N:N Relationships.

 Refer to the previous figure showing the N:N relationship between


theCUSTOMER_ORDER and PRODUCT entities named Orderline.
 Each entity becomes a single table. We have previously modeled
theCUSTOMER_ORDER table above. The PRODUCT table structure is
given here assuming the attributes shown as columns.

CUSTOMER_ORDER(OrderNumber, OrderDate, OrderAmount,


DeliveryDate, CustomerNumber)

PRODUCT(ProductNumber, Description, DesiredPrice, QtyStocked)

ProductNumber Description DesiredPrice QtyStocked


15567 Table, Dining, Oak 500.00 4
15568 Chair, with Arms, Oak 125.00 25
15569 Chair, without Arms, Oak 110.00 77
62899 Lamp, Brass 175.00 10
ORDERLINE(OrderNumber, ProductNumber, QtyOrdered,
NegotiatedPrice)

OrderNumber ProductNumber QtyOrdered NegotiatedPrice


76389 15567 1 450.00
76389 15568 2 125.00
76389 15569 4 100.00
76390 15567 1 490.00
76390 62899 2 175.00

 Primary keys are identified for each entity.


 The relationship is implemented by creating a table (ORDERLINE) to
represent the relationship symbol. This is known as a linking
table or intersection table.
 A primary composite key for the intersection table is composed of
the primary key from both entities that participate in the relationship.
Note that NO Foreign Keys are required as the primary key for the
intersection table serves to link to the other two tables that participate in the
relationship.
 Relationship data are stored in the intersection table.

Insufficient Primary Keys for Intersection Tables.

 This situation may occur where the relationship


between PATIENT and TREATMENT isN:N. A patient may have the
same treatment many different times. A treatment, e.g., chest X-ray,
appendectomy, etc., may be given to different patients.
 The solution is to include the DateTime attribute as part of the key in the
intersection table.

PATIENT(PatientID, PatientName, RoomLocation)

TREATMENT(TreatmentID, TreatmentDescription)

PATIENT_TREATMENT(PatientID, TreatmentID, DateTimeTreated,


Result)
Transforming Binary 1:1 Relationships.

 These relationships do not arise very often. As before, each entity becomes
a table.
 The selection of a Foreign Key to link the two tables is somewhat arbitrary,
and should be made based upon an analysis of the primary type of query that
will be used to access the data -- termed access path analysis.
 Consider the example shown in this ER diagram.

 Some employees will not have an assigned parking place, but this has no
impact on the table structure. Given below are table structures for
the EMPLOYEE andPARKING_PLACE tables that show two alternative
table structures to link the tables to represent the AssignedTo relationship
by using Foreign Keys.

EMPLOYEE(SSN, EmpName, EmpJobTitle, DateHired, ParkingID)

PARKING_PLACE(ParkingID, CoveredOrNot, Category)

SSN EmpName EmpJobTitle DateHired ParkingID


444-44-4444 Tracey Conner Lecturer 7/15/XX 45
555-55-5555 Doug Bock Professor 8/15/XX NULL
222-22-2222 Mary Sumner Professor 1/4/XX 23
ParkingID CoveredOrNot Category
45 NO B
62 YES C
23 YES A

OR
EMPLOYEE(SSN, EmpName, EmpJobTitle, DateHired)

PARKING_PLACE(ParkingID, CoveredOrNot, Category, SSN)

SSN EmpName EmpJobTitle DateHired


444-44-4444 Tracey Conner Lecturer 7/15/XX
555-55-5555 Doug Bock Professor 8/15/XX
222-22-2222 Mary Sumner Professor 1/4/XX
ParkingID CoveredOrNot Category SSN
45 NO B 444-44-4444
62 YES C NULL
23 YES A 222-22-2222

 Compare the two solutions. Which is better? Depends upon the path
analysis.

Transforming 1:1 and 1:N Unary Relationships.

 Consider the example 1:1 and 1:N Unary relationships shown here.
 We cover the 1:1 Marriage relationship first. The PERSON entity
becomes a table, and the PersonID field is the primary key. A foreign key
field is created namedMarriedToPersonID. The foreign key and primary
key share the same domain of valid values.

PERSON(PersonID, PersonName, MarriedToPersonID)

PersonID PersonName MarriedToPersonID


4 John Brown 6
3 Susan Smith 1
6 Mary Brown 4
1 Barry Smith 3

 The 1:N Unary relationship is modeled exactly the same way as the 1:1
Unary relationship. Consider the Supervise relationship shown in the
figure above. The primary key field is EmployeeID. The relationship is
implemented by creating aManagerEmployeeID field that has the same
domain of valid values as the EmployeeIDfield.
EMPLOYEE(EmployeeID, EmployeeName, BirthDate,
ManagerEmployeeID)

EmployeeID EmployeeName BirthDate ManagerEmployeeID


42 Tom 01/01/XX NULL
67 Bill 05/05/XX 42
75 Alice 07/07/XX 67
45 Susan 05/07/XX 42

Transforming Unary N:N Relationships.

 This is an example of the Bill of Materials N:N Unary relationship.


Each Item is produced by using other Items, e.g., an automobile may be
produced by combining Engine, Chassis, and Body items.

 The Entity becomes a table.


 Relationship also becomes a table.
 The Intersection table has composite key with shared domain.
The PartNumber from the ITEM table becomes part of the primary key in
the BILL_OF_MATERIALS table. We create another column
named ComponentPartNumber (the component is used to build the Item)
as part of the composite key. We also store the quantity of the component
required to build the item.

ITEM(PartNumber, ItemName, ItemCost)

PartNumber ItemName ItemCost


42 Cylinder Head $550
55 Engine $2,500
18 Engine Block $1,800
44 Exhaust Valve $45
... ... ...

BILL_OF_MATERIALS(PartNumber, ComponentPartNumber,
QtyToManufacture)

PartNumber ComponentPartNumber QtyToManufacture


55 42 1
55 18 1
42 44 4
.... .... ....

Transforming Ternary Relationships.

 Recall the simple Shipment relationship shown below.


 When a relationship is ternary, each entity becomes a single table.
 Like the Binary N:N, the Shipment ternary relationship becomes a table.
 All key fields from each entity are included in the intersection table.
 The primary key of intersection table includes key fields from "many" side
of relationship, e.g., for a N:N:N relationship, the primary key from each
entity is used to form a composite key in the intersection table..
 If the relationship is 1:N:N, then the key field from the entity on the "1" side
of the relationship becomes a foreign key field in the intersection table.

VENDOR(VendorNumber, VendorName, ...)

WAREHOUSE(WarehouseNumber, Location, ...)

ITEM(ItemNumber, Description, Cost, ... )

SHIPMENT(VendorNumber, WarehouseNumber, ItemNumber,


DateTimeShipped, QtyShipped)

VendorNumber VendorName
10 ABC Company
20 XYZ Company
.... ....

WarehouseId Location
100 Chicago
200 St. Louis

ItemNumber Description Cost


1000 Table $1500
2000 Chair $250
3000 Lamp $175
ItemNumber WarehouseNumber VendorNumber DateTimeShipped QtyShipped
1000 100 10 01/01/XX 1
2000 200 20 01/01/XX 3
1000 100 10 01/02/XX 1
.... .... .... ....

 Note the inclusion of the DateTimeShipped column as part of the key. This
is to guarantee uniqueness since the combination
of ItemNumber, WarehouseNumber, andVendorNumber are not unique.

Transforming the Complex Shipment Relationship.

 Recall the complex Shipping relationship shown here.


 Here each entity and each gerund become tables.
 The 1:N relationships are implemented with foreign keys in
the SHIPPMENT table.
 The N:N relationship between SHIPMENT and ITEM is implemented as a
separate table.
 The table structures are shown here with example table data.
 Note that the ShipmentNumber primary key ensures uniqueness so the
DateShipped field is not included as part of the key.

VENDOR(VendorNumber, VendorName, ...)


WAREHOUSE(WarehouseNumber, Location, ...)

ITEM(ItemNumber, Description, Cost, ... )

SHIPMENT(ShipmentNumber, WarehouseNumber, VendorNumber,


DateShipped)

ITEMS_ON_SHIPMENT(ShipmentNumber, ItemNumber, QtyShipped)

VendorNumber VendorName
10 ABC Company
20 XYZ Company
.... ....

WarehouseId Location
100 Chicago
200 St. Louis
... ...

ItemNumber Description Cost


1000 Table $1500
2000 Chair $250
3000 Lamp $175
... ... ...

ShipmentNumber WarehouseNumber VendorNumber DateTimeShipped


0001 100 10 01/01/XX
0002 200 20 01/01/XX
0003 100 10 01/02/XX
.... .... ....

ShipmentNumber ItemNumber QtyShipped


0001 1000 1
0002 2000 3
0003 1000 1
... ... ...

Transforming Associative Relationships.

 Consider the Associative Relationship shown below. Recall that this type
of relationship may be modeled as a Gerund with two 1:N relationships
instead of one N:Nrelationship.

 Each entity becomes a table and the primary keys are as indicated in the ER
diagram.
 The 1:N relationships are implemented with foreign keys in the
CERTIFICATE table.

EMPLOYEE(EmployeeID, EmpName, ...)

COURSE(CourseID, Title, PurchasePrice, ...)


CERTIFICATE(CertNumber, DateCompleted, EmployeeID, CourseID)

EmployeeID EmpName
0001 Tom Jones
0002 Barry Manilow
0003 Sandra Bullock

CourseID Title PurchasePrice


CMIS450 Database Design $1500
CMIS464 Applied Operating Systems $1750
CMIS470 Structured Design $1300

CertNumber DateCompleted EmployeeID CourseID


14562 1/15/XX 0001 CMIS450
14563 1/15/XX 0002 CMIS470
14563 1/30/XX 0001 CMIS470

Transforming Generalization Hierarchy (Supertype-Subtype) to


Relationships.

 This is a type of relationship where the database designer has to make a


determination about which of two ways to model the Super-Subtype. Refer
to the diagram below for an example.
 Relational Modeling Solution #1 (the classical solution).
o Each entity becomes a table.
o Each table has the same primary key.
o You can improve searching for the subtype of a supertype by using
a coded subtype field in the supertype table -- shown below as
the EmployeeSubType field.

EMPLOYEE(EmployeeID, Name, DateHired, EmployeeSubType)

HOURLY(EmployeeID, Wage)

SALARIED(EmployeeID, StockOption, Salary)

 Relational Modeling Solution #2 (single table solution).


o Implement the diagram as a single table.
o Allow NULL fields for the various subtypes. In the table structure
shown below,Wage would be NULL for the Salaried employees
and StockOption and Salarywould be NULL for
the Hourly employees.

EMPLOYEE(EmployeeID, Name, DateHired, Wage,


StockOption, Salary)
NORMALIZATION.

Introduction.

 This is a formal method to check tables for potential data storage problems
termed anomalies.
 In order to clarify the discussion, we will formalize the definition of several
terms.
o KEYS: The term KEY is often confusing because it has different
meanings during design and implementation of a system.
o DESIGN: During design, KEY means a combination of one or more
attributes (columns) of a relational table that uniquely identify rows in
the table.
o KEY guarantees uniqueness; no two rows can be identical.
o IMPLEMENTATION: During implementation, the term KEY is a
column on which the DBMS builds an index or other data structure, to
allow quick access to rows. Such keys need not be unique - they may
be secondary keys enabling access to a SET of rows.
o Sometimes the terms Logical Key and Physical Key are used to
distinguish between these two meanings.
o INDEXES: Since a physical key is usually an index, we often use the
term Index for a physical key.
o Indexes are created to:
 Allow quick access.
 Facilitate sorting or sorted order access.
 Insure no duplicates if the keyword UNIQUE is used when
defining an index.

Functional Dependency.

 FUNCTIONAL DEPENDENCY: A Functional Dependency (FD) is a


relationship between or among attributes, i.e. given a unique value for one or
more attributes, such as the CustomerAccountNumber, we can find a
corresponding value for another attribute such as
the CustomerBalance attribute.
 Equations represent functional dependencies. Consider the equation:

TotalPrice = ItemPrice x Quantity


 Here, TotalPrice is functionally dependent on
both ItemPrice and Quantity.
 Unlike equations, however, FDs cannot be worked out using arithmetic;
instead, they are listed in the database.
 FDs are written following standard notation. For example, if the
attribute A determines the attribute B, we write the notation:

A -> B (read A functionally determines B)

 The attribute "A" is called a determinant, here it is a determinant of B.

Rules of Functional Dependencies:

 If W, X, Y, and Z are attributes of a table, then:


o X -> X
 (reflexive rule - not terribly useful, simply means that if we
know X, then we know X).
o If X-> Y, then XZ ->Y
 (augmentation rule - Note Y is not really dependent on Z, but if
we know X and can determine a value for Y from X, then
knowing Z has no effect on our ability to determine a value for
Y).
o If X -> Y and X ->Z, then X -> YZ
 (union rule - useful for combining tables).
o If X -> Y then X -> Z if Z is a subset of Y
 (decomposition rule).
o If X -> Y and Y ->Z, then X -> Z
 (transitivity rule - useful for avoiding transitive dependencies).
o If X -> Y and YZ -> W, then XZ -> W
 (pseudotransitivity rule - useful in understanding multivalued
dependencies.

Well-Structured Tables (Relations).

 A well-structured table contains a minimum amount of redundancy and


allows users to insert, modify, and delete the rows in a table without errors
or inconsistencies.
 Examine the table below which is not well-structured. In this example, the
university is tracking which Employees enroll in University Courses.
Employees are listed who have taken (or are taking) Courses.
EmpId Name Dept Salary Course DateTook Fee
130 Margaret Math 45,000 Calculus 01/15 150
130 Margaret Math 45,000 Biology 02/15 200
200 Susan Sci 38,000 Biology 01/15 200
250 Chris Math 52,000 Calculus 03/15 150
250 Chris Math 52,000 Biology 03/15 200
425 Bill Math 48,000 Algebra 03/15 200
425 Bill Math 48,000 Calculus 04/15 150

Problems With This Table:

 Redundancy of data storage.


 Potential inconsistencies on updating data.

What is the Primary Key of this table?


To determine an appropriate key, you should first examine the FDs.
EmpId -> Name, Dept, Salary
Course -> Fee
EmpId, Course -> DateTook

 Assuming employees only take a course once (no time dependencies), then
uniqueness for the rows in the table is ensured by a composite key of EmpId
+ Course.

Data Anomalies.

 Data Anomalies are problems with data storage caused by poorly structured
tables. Refer to the table above.
 Insertion Anomaly. If the primary key is EmpId + Course, to add a new
employee, the employee must first be enrolled in a course. If an employee is
not enrolled in a course, then the COURSE column that is part of the
composite primary key will be null, and null key values are not allowed.
 Deletion Anomaly. Deleting data for Employee #425 (Bill) causes us to lose
data about Algebra and the course fee for Algebra because Bill is the only
employee who has enrolled in Algebra.
 Modification Anomaly. If the fee for Calculus is increased, the data must
be updated for more than one row.
 Note there is also a time-sensitivity between EmpId and Course since an
employee could take a course many times, but the table does not track this
fact.
First Normal Form (1NF). Remove Repeating Groups.

 Consider the following table where redundant data are eliminated from the
view:

EmpId Name Dept Salary Course DateTook Fee


130 Margaret Math 45,000 Calculus 01/15 150
Biology 02/15 200
200 Susan Sci 38,000 Biology 01/15 200
250 Chris Math 52,000 Calculus 03/15 150
Biology 03/15 200
425 Bill Math 48,000 Algebra 03/15 200
Calculus 04/15 150

 In order to store information for employees who take more than one course,
a possible table structure is:

EMPLOYEE (EmpId, Name, Dept, Salary, Course1, DateTook1, Fee1,


Course2, DateTook2, Fee2, …)

 Obviously there is a problem anticipating the number of times the group


(Course, DateTook, and Fee will repeat).
 A better table structure (allowing for the fact that there is significant data
redundancy) and a composite primary key of EmpId + Course, is:

EMPLOYEE (EmpId, Name, Dept, Salary, Course, DateTook, Fee)

EmpId Name Dept Salary Course DateTook Fee


130 Margaret Math 45,000 Calculus 01/15 150
130 Margaret Math 45,000 Biology 02/15 200
200 Susan Sci 38,000 Biology 01/15 200
250 Chris Math 52,000 Calculus 03/15 150
250 Chris Math 52,000 Biology 03/15 200
425 Bill Math 48,000 Algebra 03/15 200
425 Bill Math 48,000 Calculus 04/15 150

 An obvious problem associated with the above solution is the storage of a lot
of redundant data. We can eliminate this problem by further normalizing the
table.
Second Normal Form (2NF). Remove Partial Dependencies.

 There are many potential problems associated with the data redundancy.
Theseanomalies occur because not all attributes are fully dependent on the
primary key. Note that this type of problem only arises when the key is
a composite key.
 The FDs here are:

EmpId -> Name, Dept, Salary


Course -> Fee
EmpId, Course -> DateTook

 Solution: Divide the table into two or more tables. Do this by removing
attributes dependent on part of the key to a separate table with just that part
of the key as the primary key. This results in three tables for this particular
modeling problem:

EMPLOYEE (EmpId, Name, Dept, Salary )

EmpId Name Dept Salary


130 Margaret Math 45,000
200 Susan Sci 38,000
250 Chris Math 52,000
425 Bill Math 48,000

COURSE (Course, Fee )

Course Fee
Calculus 150
Biology 200
Algebra 200

TOOK_COURSE ( EmpId, Course, DateTook )

EmpId Course DateTook


130 Calculus 01/15
130 Biology 02/15
200 Biology 01/15
250 Calculus 03/15
250 Biology 03/15
425 Algebra 03/15
425 Calculus 04/15

 Note that DateTook should be included as part of the primary key if


employees can take a course more than once and the appropriate table and
primary key become:

TOOK_COURSE ( EmpId, Course, DateTook )

Third Normal Form (3NF). Remove Transitive Dependencies.

 A transitive dependency is a functional dependency between twoor more


non-key attributes where one of the attributes is a determinant, but not
part of the primary key and usually not a candidate key. Consider the
following table.

CustNo Name Salesperson Region


8023 Anderson Smith South
9167 Bancroft Hicks West
7924 Hobbs Smith South
6837 Tucker Hernandez East
8596 Arnold Hicks West

 The FDs here are:

CustNo -> Name, Salesperson, Region


Salesperson -> Region

 Note that Salesperson, a non-key attribute, determines Region, another non-


key attribute.
 Note Salespersons are assigned to regions and that Salesperson is
a candidate keysince it is a determinant of Region.
 Solution: Again the Table must be divided into two or more tables,
depending upon how many different transitive dependencies exist. Usually
only one exists in a table, but more than one is certainly possible.
SALES ( CustNo, Name, Salesperson )

CustNo Name Salesperson


8023 Anderson Smith
9167 Bancroft Hicks
7924 Hobbs Smith
6837 Tucker Hernandez
8596 Arnold Hicks

SPERSON (Salesperson, Region )

Salesperson Region
Smith South
Hicks West
Hernandez East

Boyce-Codd Normal Form (BCNF). Every Determinant Is A Candidate Key.


Resolve Anomalies For More Than One Candidate Key.

 If a table has more than one candidate key, data storage anomalies may
result even if the table is in 3NF.
 This situation arises when the candidate key is not identified as part of or
the primary key, and two of the candidate keys
have overlapping attributes. Consider the following table.

Player Position Coach


Earl Fullback Joe
John Guard Ed
Tony Fullback Pete
Earl Tackle Jim
Mack Fullback Joe

 Each player may play more than one position (note Earl).
 Each position can have more than one coach (note Fullback position).
 For each position a player is assigned only one coach.
 Each coach coaches only a single position.
 Each coach can coach several players who play a position.
 Anomalies: If John leaves the team, we lose the fact
that Ed coaches Guards. If we have Mel as a Quarterback coach, we
cannot insert this information into the table until a player is assigned to
play Quarterback because the Player column would be null and that is not
allowed for a column that is part of the primary key.
 No single attribute is a candidate key (determines the other two attributes).
 FDs for the table are:

Player, Position -> Coach


Player, Coach -> Position (by the augmentation rule).
Coach -> Position

 Confirm for yourself that the table is in 3NF.


 Solution: Divide the table into two tables. Place the attribute that is
a determinant, but not a primary key (Coach) in a separate table with the
attributes that it determines and make it the key.
 Here this refers to the COACH table since the attribute Coach is a
determinant, but not a candidate key.

POSITION ( PLAYER, COACH )

Player Coach
Earl Joe
John Ed
Tony Pete
Earl Jim
Mack Joe

COACH (COACH, POSITION )

Coach Position
Joe FB
Ed G
Pete FB
Jim T

Fourth Normal Form (4NF). Remove Multivalued Dependencies.


 Once BCNF has been achieved, there are no additional anomalies that can
result from functional dependency problems. However, problems can arise
from multi-valueddependencies.
 In order for this type of problem to occur, there must be at least three
attributes in a table with the following associations:

A ->> B (read A is associated with multiple values of B)


A ->> C (read A is associated with multiple values of C)
B <-/-> C (read B and C are independent of one another)

 Consider the situation where each course can have several instructors and
eachcourse uses several textbooks, but the instructors do not select
the textbooks (they are independent of one another).

Course ->> Instructor


Course ->> Textbook
Instructor <-/-> Textbook

 In the following table the primary key is a composite of all three


attributes: Course + Instructor + Textbook.
 Note that this represents in an E-R diagram model the intersection table for
a ternaryrelationship.
 The 4NF problem arises when the data should have been modeled as a set
of two binary relationships instead of a ternary relationship.

OFFERING (Course, Instructor, Textbook) -- Note the composite key -- this


table structure is WRONG!!!!

Course Instructor Textbook


Mngt White Drucker
Mngt Green Drucker
Finance Gray Weston
Finance Gray Gilford
Mngt White Peters (new row)
Mngt Green Peters (new row)

 If a new textbook (one written by Peters) is adopted for use in


the Mngt course by all instructors (department mandates usage of the book),
then it is necessary to addmultiple rows to this table.
 The new rows would be those indicated above as new and it is the problem
of having to add multiple rows that is the anomaly here.
 Solution: Divide the table into two new tables:

INSTRUCTOR(Course, Instructor)

Course Instructor
Mngt White
Mngt Green
Finance Gray

TEXT(Course, Textbook)

Course Textbook
Mngt Drucker
Finance Weston
Finance Gilford
Mngt Peters

 When information about the Peters book for the Mngt course is added, only
a single row is added to the TEXT table.

Fifth Normal Form (5NF). Join Dependency.

 This problem arises if a table is divided into two or more tables, but it is not
possible to join the tables together again to create the original view of the
data without inaccurate join results.
 This problem arises in the case of ternary relationships and is covered in
detail with examples in my article, "Entity Relationship Modeling and
Normalization Errors."
 Basically, this occurs when a relationship that must be modeled as
a ternary relationship is incorrectly modeled as a set of two or
more binary relationships. This is essentially the opposite problem that
arises with 4NF.
 We will study this problem in more detail by reading the article on
normalization errors.

You might also like