3 Relational Data Model
System
[CoEg3193]
Chapter Three:
Relational Data Model
Outline
• Structure of Relational Database
• Conceptual Data Model to Relational Model
Mapping
• Dependencies
– Functional Dependencies
– Multivalued Dependencies
• Normal Forms and Normalization
Cont’d
• Relational Data Model is an implementation
(representational) model proposed by E.F.
Codd in 1970.
• The model is the basis for database design
targeting a Relational Database Management
System (RDBMS).
Structure of Relational Database
• The main construct for representing data in
the relational database is a two-dimensional
table called a relation.
– The columns of the table represent the
attributes of the relation, and
– the rows (other than the heading row) represent
tuples (records) of the relation.
Example:-“EMPLOYEES” relation
Example
• Employees (EmpId:string, Name:string,
BDate:date, SubCity:string, Kebel:integer,
Phone:string)
• Projects (PrjId:integer, Name:string,
SDate:date, DDate:date, CDate:date)
• Teams (Name:string, Descr:string)
Properties of Relations
• Rows (tuples) in a single relation are unique (that is,
no two tuples are identical).
• Relations are sets of tuples, not lists (that is, the order of
tuples in a relation is immaterial).
• Attributes are atomic.
• The values that appear in a column must be drawn
from the domain associated with that column.
• The degree, also called arity, of a relation is the
number of attributes in the relation.
• The relation names in a relational database are
distinct.
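The properties above can be illustrated with a small sketch that models a relation instance as a Python set of named tuples. The sample rows and the variable names are hypothetical; only the Teams(Name, Descr) schema comes from the slides.

```python
from typing import NamedTuple

class Team(NamedTuple):
    """One tuple of the Teams(Name, Descr) relation."""
    name: str
    descr: str

# A relation instance modelled as a set of tuples (hypothetical sample rows).
teams = {
    Team("Design", "UI and UX work"),
    Team("Backend", "Server-side services"),
}

# Adding an identical tuple has no effect: a set holds no duplicates.
teams.add(Team("Design", "UI and UX work"))

# Order is immaterial: two sets holding the same tuples are equal.
same_teams = {
    Team("Backend", "Server-side services"),
    Team("Design", "UI and UX work"),
}

degree = len(Team._fields)   # number of attributes (arity)
cardinality = len(teams)     # number of tuples in this instance
```

Here `degree` is 2 and `cardinality` stays 2 after the duplicate insert, which mirrors the uniqueness and set properties listed above.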
Key Constraints
• A key constraint is a statement that a certain
minimal subset of the attributes of a relation
is a unique identifier for a tuple in the relation.
• A set of attributes that uniquely identifies a
tuple according to a key constraint is called a
candidate key for the relation; often
abbreviated just as key.
• Key attributes in the relational model are
indicated by underlining the attributes in the
relation schema.
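As a rough illustration of "minimal unique identifier", the sketch below tests an attribute set against one relation instance. The helper names and the sample rows are hypothetical. Note the limitation: a key constraint speaks about every legal instance, so a check on one instance can refute a key but cannot prove it.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """True if attrs is unique in rows and no proper subset of attrs is."""
    attrs = tuple(attrs)
    if not is_unique(rows, attrs):
        return False
    for size in range(len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False  # a smaller identifier exists, so attrs is not minimal
    return True

# Hypothetical instance of Teams(PrjId, Name, Descr).
teams = [
    {"PrjId": 1, "Name": "Design", "Descr": "UI for project 1"},
    {"PrjId": 1, "Name": "Backend", "Descr": "API for project 1"},
    {"PrjId": 2, "Name": "Design", "Descr": "UI for project 2"},
]

key_ok = is_candidate_key(teams, ["PrjId", "Name"])
name_only = is_candidate_key(teams, ["Name"])
```

In this instance `key_ok` is `True`, while `name_only` is `False` because two teams share the name "Design".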
Cont’d
• Examples:
– Employees (EmpId, Name, BDate, SubCity, Kebel,
Phone)
– Projects (PrjId, Name, SDate, DDate, CDate)
– Teams (Name, Descr)
• REMARK: Note that a key for a relation may
not be directly inferred from the high-level
conceptual models in some cases.
Foreign Key Constraints
• The most common integrity constraint
involving two relations is a foreign key
constraint. It keeps data consistency when a
data modification is done on a relation.
• The foreign key in the referencing relation
must match a primary key value in the
referenced relation. That is, the referenced
relation must have a key attribute of a
compatible data type that the referencing
relation can refer to.
Cont’d
• Example
– Employees (EmpId, Name, BDate, SubCity, Kebel,
Phone)
– WorkSchedule (SDate, EDate, HoursPerDay,
Employee)
• For “WorkSchedule” to refer to the “Employees”
relation instance, it has an attribute “Employee” of
the same type as “EmpId”, the primary key of the
“Employees” relation.
• The foreign key constraint is implemented through
the “Employee” attribute in the referencing relation
“WorkSchedule”.
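A minimal sketch of this constraint using Python's standard `sqlite3` module is shown below. The column types, the sample values and the choice of SQLite are assumptions made for illustration; the slides do not prescribe a particular RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE Employees (
        EmpId TEXT PRIMARY KEY,
        Name  TEXT
    )""")
conn.execute("""
    CREATE TABLE WorkSchedule (
        SDate       TEXT,
        EDate       TEXT,
        HoursPerDay INTEGER,
        Employee    TEXT REFERENCES Employees(EmpId)
    )""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Employees VALUES ('E001', 'Abebe')")
conn.execute("INSERT INTO WorkSchedule VALUES ('2024-01-01', '2024-01-31', 8, 'E001')")

try:
    # 'E999' does not exist in Employees, so the insert should be rejected.
    conn.execute("INSERT INTO WorkSchedule VALUES ('2024-02-01', '2024-02-28', 8, 'E999')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The second schedule row is refused because its `Employee` value has no matching `EmpId`, which is how the constraint keeps the two relations consistent.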
Cont’d
• Note:-
– A single tuple can be referenced by zero or more
tuples in the referencing relation, but a single
tuple with a single foreign key attribute can only
reference one tuple.
– A foreign key could refer to the same relation.
– A relational database consists of relations that are
related to one another through foreign keys.
Conceptual Data Model to
Relational Model Mapping
• The second phase in database design is
implementation design, which
– transforms the conceptual data model into
an internal model (a schema such as a
relational data model) for implementation
in a relational database management
system (RDBMS).
From E/R Diagram to
Relational Model
Cont’d
• The entity sets and relationships of an E/R diagram
describe a schema, whereas the actual
sets of entities and relationship sets form
an instance of that E/R schema,
which is not part of the database design.
Entity Sets to Relations
• Strong entity sets in E/R model are mapped
to relations in relational model with the same
name and attributes.
• The primary keys assigned for the entity sets
are also represented as keys in the relations.
Example
Handling Weak Entity Sets
• Suppose
– W is a weak entity set with attribute set {a1, a2, a3, … an}
and identifying strong entity set E.
– and let the primary key of E be the set {b1, b2, … bm},
• then the attributes of the relation for the
weak entity set
– must include attributes for its complete key
(including those belonging to the identifying
strong entity set) and its own non-key attributes.
– That is, the set of attributes of the mapped
relation is
{a1, a2, a3, … an} ∪ {b1, b2, … bm}.
Cont’d
• The primary key for the weak entity set
relation thus includes:
– The discriminator (attributes) of the weak entity
set, and
– The primary key of the identifying strong entity
set.
• Example:- For the weak entity set (TEAMS)
in figure above the corresponding relation is:
– Teams(ProjId, Name, Descr)
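The mapping rule for weak entity sets can be sketched as a small helper. The function name is hypothetical, and the sketch assumes that Name is the discriminator of TEAMS and that ProjId is the primary key of the identifying entity set, since the underlining that would mark the key is not visible here.

```python
def map_weak_entity_set(name, own_attrs, discriminator, owner_key):
    """Build (relation name, attributes, primary key) for a weak entity set.

    The attributes are the owner's key plus the weak entity set's own
    attributes; the primary key is the owner's key plus the discriminator.
    """
    attributes = list(owner_key) + [a for a in own_attrs if a not in owner_key]
    primary_key = list(owner_key) + list(discriminator)
    return name, attributes, primary_key

# Assumed inputs: Name is the discriminator, ProjId identifies the project.
schema = map_weak_entity_set(
    name="Teams",
    own_attrs=["Name", "Descr"],
    discriminator=["Name"],
    owner_key=["ProjId"],
)
```

Under these assumptions `schema` is `("Teams", ["ProjId", "Name", "Descr"], ["ProjId", "Name"])`, matching the Teams relation given in the example.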
Handling Composite and
Multivalued Attributes
• Composite attributes from E/R model to a relational model
can be represented by creating separate attributes for each
of the components of the attributes (Note that the composite
attribute is not mapped directly into a separate attribute).
• Multivalued attributes are handled by creating a relation
named after the attribute, whose attributes
correspond to the components of the multivalued attribute
together with the primary key of the entity set or relationship set to
which the attribute belongs. The primary key for the newly
created relation consists of:
– The primary key of the entity set or relationship set, and
– The attribute or set of attributes from the multivalued
attribute.
Cont’d
• Example:- Consider the EMPLOYEES entity
set in figure above the corresponding relations
for the entity set are:
– Employees(EmpId, Name, BDate, Age)
– Addresses(City, SubCity, Kebele, HNo, Phone1,
Phone2, EmpId)
• REMARK: If the multivalued attribute
has a small, fixed multiplicity, it can be
represented by separate attributes, one for each
occurrence. For example, consider the phone attribute
above.
Relationship Sets to Relations
• Suppose entity set E with a primary key {a11,
a12, a13, … a1n} is related to an entity set F with
a primary key {a21, a22, a23, … a2m} through a
relationship R. Let the relationship R have a
descriptive attribute set {b1, b2, b3, … bp}; then
the relationship is represented by a relation
whose attributes are:
– The keys of the connected entity sets: {a11, a12, a13,
… a1n} ∪ {a21, a22, a23, … a2m}, and
– The descriptive attributes of the relationship:
{b1, b2, b3, … bp}.
Cont’d
• Suppose entity set E and F are related
through a many-to-one relationship R from
E to F, then it is possible to join the relations
for E and R that come out of this E/R model
into a single relation S with a schema
consisting of:
– All attributes of the entity set E,
– The key attributes of the entity set F, and
– All attributes of the relationship R.
Cont’d
• If the participation of E in R is total, it is
also possible to include all attributes of F in
the relation S and have one single relation S in
place of the three relations E, F and R.
• The primary key for S would be the primary key
of E.
Cont’d
• Example:-Consider the entity sets
“PROJECTS” and “CUSTOMERS” and the
corresponding relationship “Owns”, then we
can have:
– Projects(ProjId, Name, SDate, DDate, CustId)
and Customers(CustId, Name, Address), or
– Projects(ProjId, Name, SDate, DDate, CustId,
Name, Address)
Representation of Generalization
and Specialization (Subclass)
• Hierarchical structure (Specialization and
Generalization or Inheritance) in the relational
model can be represented in three different
ways:
1. E/R Style
2. Use of Nulls
3. Object-Oriented Approach
Cont’d
1. E/R Style: One relation for each lower-level entity
set and the higher-level entity set. Every relation of
the lower-level entity set will include:
– Key attribute(s) of the higher-level entity set which forms
the primary key of the entity set, and
– Attributes of that lower-level entity set.
• For total and disjoint generalization the higher-level
entity set may not be mapped into a relation; instead
all its attributes are passed to the relations of all
immediate lower-level entity sets.
Cont’d
2. Use of Nulls : One relation having a large set of attributes
of all the lower-level entity sets and higher-level entity set;
entities have NULL in attributes that don’t belong to them.
• Involves large number of NULL values for disjoint
generalization.
3. Object-Oriented Approach: One relation per subset of
subclasses, with all relevant attributes including:
– Attributes of the higher-level entity set, and
– Attributes of that lower-level entity set.
• The primary key of the higher-level entity set becomes the
primary key of each relation.
Cont’d
• Example:- Consider the entity sets
“EMPLOYEES” and its lower-level entity
sets, then
– FullTimeEmployees(EmpId, Salary, Saving,
Allowance)
– PartTimeEmployees(EmpId, HourlyPay,
ContractPeriod)
From ODL Model to Relational
Model
Cont’d
• ODL model to relational model mapping is
similar to the mapping process of E/R model
to relational.
ODL Class to Relations
• ODL classes are directly mapped to relations
in the relational model with their attributes, as in
the case of E/R. Unlike E/R entity sets, ODL
classes define keys only optionally; when no key
is defined, it is necessary to add a new attribute
as a primary key for the relation.
• Non-atomic attributes in ODL classes are
represented by simply expanding the
structure definition and making one attribute
for each field of the structure.
• Example:- Consider the “EMPLOYEES” class
partial declaration in chapter 2; the
corresponding relational model is:
class Employee (extent Employees key EmpId, NationalId, (Name, BDate)){
attribute string empId;
attribute string name;
attribute integer age;
attribute enum Gender {Male, Female} gender;
attribute struct Address {string city, string hAddr, string phone} address;
:
};
Cont’d
• Example:- Consider the “EMPLOYEES” class
having a set of addresses:
class Employee (extent Employees key EmpId, NationalId, (Name, BDate)){
.
.
attribute struct set Address {string city, string hAddr, string phone} address;
:
};
Cont’d
2. By separating out each set-valued attribute
into a new relation and establishing a many-
to-many relationship.
3. By having multiple attributes sets for each
set-valued attribute. This is applicable only if
the type constructor is fixed size array.
ODL Relationships to Relations
• Recall that ODL relationships are represented
in pairs as inverse relationships; hence in
representing ODL relationships only one of
the declarations is used.
• Similar to E/R relationships, ODL
relationships can also be represented by a
relation having the primary keys of the
related classes as attributes.
• Example:- Consider the relationship between
“EMPLOYEES” class and the “TEAMS”
class.
class Employee {
relationship Set <Team> assigned
inverse Team::formed;
};
class Team {
relationship Set <Employee> formed
inverse Employee::assigned;
};
• Example:- Consider the one-to-many
relationship between “PROJECTS” and
“CUSTOMERS”
class Project {
relationship Customer ownedBy
inverse Customer::owns;
};
class Customer {
relationship Set <Project> owns
inverse Project::ownedBy;
};
Functional Dependencies
• Functional dependency is a kind of constraint that helps to
remove redundancy in relational database design.
• Definition:
– A functional dependency, denoted by X → A, is an assertion
about a relation R that whenever two tuples of R agree on
all the attributes of X, they must also agree on the
attribute A. We say that “X → A holds in R” or “X
functionally determines A”.
– Note that in the notation X → A, X represents a set of
attributes and A represents a single attribute. That is,
A1A2A3…An → B
• The functional dependency is a generalization of the notion
of superkey.
Example:
• Consider the Teams relation:
– Teams(PrjId, Name, Descr), then
– PrjId, Name → Descr
• For the Employees relation:
– Employees(EmpId, NationalId, Name, BDate, Age,
Gender, City, HAddr, Phone)
– EmpId → Name; EmpId → Age; Name, BDate →
Gender
• A functional dependency A1A2A3…An → B is
said to be a trivial dependency if B is an
element of {A1, A2, A3, …, An}.
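The definition can be turned into a simple check on a relation instance, sketched below. The function name and the sample rows are hypothetical. A functional dependency is a statement about all legal instances, so passing the check on one instance does not prove the dependency, but a failure does refute it.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical instance of Teams(PrjId, Name, Descr).
teams = [
    {"PrjId": 1, "Name": "Design", "Descr": "UI for project 1"},
    {"PrjId": 1, "Name": "Backend", "Descr": "API for project 1"},
    {"PrjId": 2, "Name": "Design", "Descr": "UI for project 2"},
]

full_key = fd_holds(teams, ["PrjId", "Name"], ["Descr"])
name_only = fd_holds(teams, ["Name"], ["Descr"])
```

Here `full_key` is `True`, while `name_only` is `False` because the two "Design" teams carry different descriptions.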
Rules of Functional Dependency
• Combining Rule:
– The functional dependencies:
A1A2A3…An → B1
A1A2A3…An → B2
A1A2A3…An → B3
.
.
A1A2A3…An → Bm
Can be written as
A1A2A3…An → B1B2B3…Bm
…Rules of Functional
Dependency
• Splitting Rule:
– The functional dependencies:
A1A2A3…An → B1B2B3…Bm
Can be written as
A1A2A3…An → Bi for i = 1, 2, 3, …, m
…Rules of Functional
Dependency
• Closure of Attributes
– Suppose {A1, A2, …An} is a set of attributes and
S is a set of functional dependencies in relation R.
The closure of the set {A1, A2, …An} under the
functional dependency set S is the set of all
attributes B such that A1A2A3…An → B follows from
the set S. The closure of the attribute set {A1, A2,
…An} is denoted by {A1, A2, …An}+.
– The closure of a set of attributes can be determined
by repeatedly applying the following three rules
known as Armstrong’s Axioms.
…Rules of Functional
Dependency
• Reflexivity Rule:
– If α is a set of attributes and β ⊆ α, then α → β
holds.
• Augmentation Rule:
– If α → β holds and γ is a set of attributes, then γα →
γβ holds.
• Transitivity Rule:
– If α → β holds and β → γ holds, then α → γ holds.
…Rules of Functional
Dependency
• The algorithm for computing the closure X+ of X is given below.
1. Let X+ be the set of attributes that will eventually become the
closure. First, we initialize X+ to be X.
2. Now, we repeatedly search for some functional dependency
B1B2…Bm → C such that all of B1, B2, …, Bm are in the set of
attributes X+ but C is not. We then add C to the set X+.
3. Repeat step 2 as many times as necessary until no more
attributes can be added to X+.
4. The set X+, after no more attributes can be added to it, is the
closure of X.
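The four steps above can be sketched in Python as follows. The function name and the functional dependency set used to exercise it are hypothetical and chosen only for illustration; each dependency is a pair of attribute sets (left side, right side).

```python
def closure(attrs, fds):
    """Compute the closure of attrs under the functional dependencies fds."""
    result = set(attrs)                      # step 1: start from X itself
    changed = True
    while changed:                           # step 3: repeat until stable
        changed = False
        for lhs, rhs in fds:
            # step 2: the left side is inside the set, the right side is not yet
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result                            # step 4: nothing more can be added

# Hypothetical dependency set over R(A, B, C, D, E).
fds = [
    ({"A", "B"}, {"C"}),
    ({"C"}, {"D"}),
    ({"D"}, {"E"}),
]

closure_ab = closure({"A", "B"}, fds)
closure_d = closure({"D"}, fds)
```

With this set, `closure_ab` contains all five attributes, while `closure_d` is `{"D", "E"}`. To test whether X → A follows from a dependency set, it is enough to check whether A is in the closure of X.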
Example:
Exercise: Test whether D → A follows
from the functional dependency set.
• To test for D → A, first determine the closure
of {D}
• X = {D}
• From the functional dependency D → E, we add
E to X, that is, X = {D, E}
• No more changes in X are possible. Thus
{D}+ = {D, E}
• Since A is not in {D}+, D → A does not hold.
Multivalued Dependencies
• A multivalued dependency for a relation R is
a constraint stating that, when the values of one
set of attributes are fixed, the values of
certain other attributes are independent of the
values of all the remaining attributes in R.
…Multivalued Dependencies
• That is, for a multivalued dependency X →→ Y
in R, where X and Y are subsets of the set of
attributes in R, if t and u are tuples in the
relational instance r for the schema R that
agree on all the X's, then there exists a third
tuple v that agrees:
1. with both t and u on X’s,
2. with t on Y’s, and
3. with u on all attributes of R that are not among
X’s or Y’s (R – (X ∪ Y)).
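The three conditions on the tuple v can be checked mechanically on a relation instance, as in the sketch below. The function name and the sample rows are hypothetical. As with functional dependencies, a check on one instance can refute a multivalued dependency but cannot prove that it holds for the schema.

```python
def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y on rows (a list of dicts) over the attribute list attrs."""
    rest = [a for a in attrs if a not in x and a not in y]
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t in rows:
        for u in rows:
            if all(t[a] == u[a] for a in x):
                # v agrees with t on X and Y, and with u on the other attributes.
                v = {a: u[a] for a in rest}
                v.update({a: t[a] for a in x})
                v.update({a: t[a] for a in y})
                if tuple(v[a] for a in attrs) not in present:
                    return False
    return True

attrs = ["Name", "Instrument", "Language"]
# Hypothetical rows: every instrument is paired with every language.
rows_ok = [
    {"Name": "Ann", "Instrument": "Piano", "Language": "French"},
    {"Name": "Ann", "Instrument": "Piano", "Language": "Spanish"},
    {"Name": "Ann", "Instrument": "Flute", "Language": "French"},
    {"Name": "Ann", "Instrument": "Flute", "Language": "Spanish"},
]
rows_bad = rows_ok[:3]  # the (Flute, Spanish) combination is missing

holds = mvd_holds(rows_ok, attrs, ["Name"], ["Instrument"])
broken = mvd_holds(rows_bad, attrs, ["Name"], ["Instrument"])
```

Here `holds` is `True` and `broken` is `False`, since dropping one combination removes a tuple v that the definition requires.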
Rules of Multivalued Dependency
• Multivalued dependency is a generalization
of the functional dependency. That is,
– If α → β holds, then α →→ β also holds.
• All the rules except the splitting rule for the
functional dependency are also applicable to a
multivalued dependency.
…Rules of Multivalued
Dependency
• Complementation Rule:
– One additional rule in a multivalued dependency
that does not have a counterpart in functional
dependency is the complementation rule.
– The rule states that if X →→ Y holds then
X →→ (R – (X ∪ Y)), where R is the set of attributes
of the relational schema R.
Normal Forms and Normalization
• In relational databases, normalization is a
process that helps to
– eliminate redundancy,
– organize data efficiently,
– reduce the potential for anomalies during data
operations, and
– improve data consistency.
• The formal classifications used for quantifying
"how normalized" a relational database is are
called normal forms (abbreviated as NF).
Normalization
• Following standard database normalization
recommendations when designing databases can
improve a database's performance by
helping to:
– Reduce the total amount of redundant data in
the database. The less data there is, the less work the
RDBMS has to perform, which speeds up its
performance.
– Reduce the use of NULLs in the database. The
use of NULLs in a database can reduce
database performance, especially in WHERE
clauses.
…Normalization
– Reduce the number of columns in tables. The
fewer columns a table has, the more rows
can fit on a single data page, which helps to boost
read performance of the RDBMS.
– Reduce the amount of SQL code. The less code
there is, the less that has to run, speeding your
application's performance.
– Maximize the use of clustered indexes. The
more data is separated into multiple tables
because of normalization, the more clustered
indexes become available to help speed up data
access.
…Normalization
– Reduce the total number of indexes. The fewer
columns tables have, the less need there is for
multiple indexes to retrieve the data. And the fewer
indexes there are, the smaller the negative performance
effect of data insertion, modification and deletion.
Cont’d
• Redundancy in a database design results in
data anomalies classified as:
– Insertion Anomalies
– Deletion Anomalies
– Modification Anomalies
Example:
• Consider relation schemas for Employees
and Teams combined in a single relation as follows
– Emp_Teams(EmpId, Name, BDate, Gender,
TeamId, Project, TeamName)
• It can easily be noted that there is redundancy
of data in the “Emp_Teams” relation for the
Team details of the Employees.
• Consider the following instance of the
relation
…Example:
• Insertion Anomalies: Suppose we want to insert a
new employee that works in project 1 as a
programmer; then the corresponding fields for the
Team detail have to be entered correctly. If data is
entered incorrectly, consistency will be violated.
• Deletion Anomalies: Suppose E003 is to be
removed from the employees list; then the Team
information of TeamId 5 will also be removed, and
vice versa.
• Modification Anomalies: During data update,
consistency may also be violated as in the case of
insertion.
Denormalization
• Although normalization is a way to remove
redundancy-related anomalies and preserve
consistency, integrity and maintainability, it
may also lead to:
– Increase in storage space
– Complex queries (queries with many
joins of tables)
• In such situations it may be desirable to
denormalize some of the tables in order to
reduce storage space and the number of
required joins.
…Denormalization
• Denormalization is the process of selectively
taking normalized tables and re-combining
the data in them. Sometimes the addition of a
single column of redundant data to a table
from another table can reduce a 4-way join
into a 2-way join, significantly boosting
performance by reducing the time it takes to
perform the join.
…Denormalization
• Databases intended for Online Transaction
Processing (OLTP) are typically normalized. By
contrast, databases intended for Online
Analytical Processing (OLAP) operations are
primarily "read only" databases and tend to
extract historical data that has accumulated in
the project for quite a long time. For such
databases, redundant or "denormalized" data
may facilitate Business Intelligence
applications.
…Denormalization
• While denormalization can boost storage and query
performance, it can also have negative effects.
• For example, by adding redundant data to tables, you
risk the following problems:
– More data means the RDBMS has to read more
data pages than otherwise needed, hurting
performance.
– Redundant data can lead to data anomalies and
bad data.
– In many cases, extra code will have to be written
to keep redundant data in separate tables in sync,
which adds to database overhead.
Normal Forms
• Normalization procedure provides:
– A framework for analyzing relation
schemas based on functional and
multivalued dependencies.
– A series of normal form tests that can be
carried out on individual relation schemas
so that the relational database can be
normalized to any degree.
…Normal Forms
• Normalization through decomposition needs to
preserve two additional
properties of a relational schema:
– Lossless or Nonadditive Join: The nonadditive join
property guarantees that spurious tuple
generation does not occur after decomposition.
– Dependency Preservation: Dependency
preservation ensures that each functional
dependency is represented in one of the individual
relations resulting from decomposition.
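The nonadditive join property can be illustrated by projecting an instance onto two attribute sets and joining the projections back. The helper names and sample rows below are hypothetical, and the sketch checks a single instance only; a real test of the property reasons over the dependencies of the schema.

```python
def to_relation(rows):
    """Turn a list of dicts into a set of tuples (frozensets of attribute/value pairs)."""
    return {frozenset(r.items()) for r in rows}

def project(relation, attrs):
    """Projection onto attrs; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in relation}

def natural_join(r1, r2):
    """Join tuples that agree on all attributes they have in common."""
    result = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                result.add(frozenset({**d1, **d2}.items()))
    return result

def is_lossless(relation, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original tuples."""
    return natural_join(project(relation, attrs1), project(relation, attrs2)) == relation

# Hypothetical instance; two different teams share the name "Design".
rel = to_relation([
    {"EmpId": "E001", "TeamId": 1, "TeamName": "Design"},
    {"EmpId": "E002", "TeamId": 1, "TeamName": "Design"},
    {"EmpId": "E003", "TeamId": 2, "TeamName": "Backend"},
    {"EmpId": "E004", "TeamId": 3, "TeamName": "Design"},
])

good = is_lossless(rel, {"EmpId", "TeamId"}, {"TeamId", "TeamName"})
bad = is_lossless(rel, {"EmpId", "TeamName"}, {"TeamId", "TeamName"})
```

Splitting on TeamId reproduces the instance (`good` is `True`). Splitting on TeamName does not (`bad` is `False`), because the join produces spurious tuples such as employee E001 in team 3.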
First Normal Form (1NF)
– A relation (table) R is in 1NF if and only if all
underlying domains of attributes contain only
atomic (simple, indivisible) values, i.e. the value
of any attribute in a tuple (row) must be a single
value from the domain of that attribute.
• 1NF requires the removal of multivalued attributes,
composite attributes and their combinations from
the relational schema.
• Normalization (Decomposition)
– Form a new relation for each non-atomic attribute
or nested relation.
1st Normal Form
First Normal Form is violated if:
Employees Table
EMPID  LNAME     FNAME   SEX  DEPT  PHONE     SALARY  LANGUAGES
23     Jones     Mark    M    ITR   555-1087  45000   COBOL, JAVA, SQL
25     Smith     Sara    F    FINC  555-2222  55000
26     Billings  David   M    ACTG  555-4356  42000
31     Dance     Ivanna  F    ACTG  444-4887  60000   SQL
32     Jones     Mary    F    ITR   555-8745  70000   JAVA, SQL, VB, COBOL
35     Barker    Bob     M    ACTG  555-6565  44000
36     Woods     Robin   M    ITR   555-9812  90000   VB, SQL, JAVA
37     Jones     Mary    F    FINC  555-1234  56000   COBOL, SQL
Languages Table
NAME FULLNAME
COBOL COmmon Business Oriented Language
JAVA JAVA
SQL Structured Query Language
VB Visual Basic
1NF Example – Schema 3 (incorrect)
Employees Table
EMPID  LNAME     FNAME   SEX  DEPT  PHONE     SALARY  LANG1  LANG2  LANG3  LANG4
23     Jones     Mark    M    ITR   555-1087  45000   COBOL  JAVA   SQL
25     Smith     Sara    F    FINC  555-2222  55000
26     Billings  David   M    ACTG  555-4356  42000
31     Dance     Ivanna  F    ACTG  444-4887  60000   SQL
32     Jones     Mary    F    ITR   555-8745  70000   JAVA   SQL    VB     COBOL
35     Barker    Bob     M    ACTG  555-6565  44000
36     Woods     Robin   M    ITR   555-9812  90000   VB     SQL    JAVA
37     Jones     Mary    F    FINC  555-1234  56000   COBOL  SQL
Languages Table
NAME FULLNAME
COBOL COmmon Business Oriented Language
JAVA JAVA
SQL Structured Query Language
VB Visual Basic
1NF Example – Schema 4 (incorrect)
Employees Table
EMPID LNAME FNAME SEX DEPT PHONE SALARY COBOL JAVA SQL VB
23 Jones Mark M ITR 555-1087 45000 T T T F
25 Smith Sara F FINC 555-2222 55000 F F F F
26 Billings David M ACTG 555-4356 42000 F F F F
31 Dance Ivanna F ACTG 444-4887 60000 F F T F
32 Jones Mary F ITR 555-8745 70000 T T T T
35 Barker Bob M ACTG 555-6565 44000 F F F F
36 Woods Robin M ITR 555-9812 90000 F T T T
37 Jones Mary F FINC 555-1234 56000 T F T F
Languages Table
NAME FULLNAME
COBOL COmmon Business Oriented Language
JAVA JAVA
SQL Structured Query Language
VB Visual Basic
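One way to bring the first Employees table into 1NF is to move the LANGUAGES list into its own relation keyed by (EMPID, LANGUAGE), following the decomposition rule stated earlier. The sketch below does this for three of the rows shown above; the function name and the name of the resulting pair list are hypothetical.

```python
# Three rows taken from the Employees table above (other columns omitted).
employees = [
    {"EMPID": 23, "LNAME": "Jones", "LANGUAGES": "COBOL, JAVA, SQL"},
    {"EMPID": 25, "LNAME": "Smith", "LANGUAGES": ""},
    {"EMPID": 31, "LNAME": "Dance", "LANGUAGES": "SQL"},
]

def to_1nf(rows, key, multivalued):
    """Split a comma-separated column into a separate (key, value) relation."""
    # The base relation keeps every column except the multivalued one.
    base = [{a: v for a, v in r.items() if a != multivalued} for r in rows]
    # The new relation holds one atomic value per tuple.
    pairs = [
        (r[key], item.strip())
        for r in rows
        for item in r[multivalued].split(",")
        if item.strip()
    ]
    return base, pairs

employees_1nf, employee_languages = to_1nf(employees, "EMPID", "LANGUAGES")
```

After the split, `employee_languages` holds `(23, "COBOL")`, `(23, "JAVA")`, `(23, "SQL")` and `(31, "SQL")`, and an employee with no languages simply has no tuple in it, so no empty or NULL columns are needed.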
Second Normal Form (2NF)
• A relation schema R is in 2NF if it is in 1NF and
every non-prime attribute A in R is fully functionally
dependent on the primary key (i.e. not partially
dependent on any candidate key).
• A functional dependency X → Y is said to be a full
functional dependency if removing any attribute
from X causes the dependency to no longer hold.
• NOTE: Most relational schemas that are mapped
carefully from E/R and ODL models are in 2NF.
…Second Normal Form (2NF)
• Normalization (Decomposition)
– Decompose and set up a new relation for each
partial key with its dependent attribute(s). Make
sure to keep relation with the original primary
key and any attributes that are fully functionally
dependent on it.
…Second Normal Form (2NF)
• Example: Consider relation schemas for
Employees and Teams combined in a single relation as follows
– Emp_Teams(EmpId, Name, BDate, Gender, TeamId,
Project, TeamName)
– EmpId → Name, BDate, Gender
– TeamId → Project, TeamName
• Then upon decomposition we will have
– Employees(EmpId, Name, BDate, Gender)
– Teams(TeamId, Project, TeamName)
– Emp_Teams(EmpId, TeamId)
2nd Normal Form
Second Normal Form is violated if:
Transactions Table
JENO  LINENO  DATE         DESCRIPTION         ACCTNO  ACCTNAME       AMOUNT
1     1       02-JAN-2003  Owner investment    100     Cash           20,000
1     2       02-JAN-2003  Owner investment    310     Smith-Capital  (20,000)
2     1       03-JAN-2003  Borrowed money      100     Cash           30,000
2     2       03-JAN-2003  Borrowed money      220     Notes Payable  (30,000)
3     1       03-JAN-2003  Purchased Supplies  120     Supplies       5,000
3     2       03-JAN-2003  Purchased Supplies  100     Cash           (1,000)
3     3       03-JAN-2003  Purchased Supplies  220     Notes Payable  (4,000)
Journal_Entry Table
JENO DATE DESCRIPTION
1 02-JAN-2003 Owner investment
2 03-JAN-2003 Borrowed money
3 03-JAN-2003 Purchased Supplies
Transactions Table
JENO LINENO ACCTNO ACCTNAME AMOUNT
1 1 100 Cash 20,000
1 2 310 Smith-Capital (20,000)
2 1 100 Cash 30,000
2 2 220 Notes Payable (30,000)
3 1 120 Supplies 5,000
3 2 100 Cash (1,000)
3 3 220 Notes Payable (4,000)
Third Normal Form (3NF)
• 3NF requires that a relation schema R be in
2NF and that no nonprime attribute of R be
transitively dependent on the primary key. In summary,
all non-key attributes are mutually
independent. Thus, any relation in which all
the attributes are prime attributes (part of
some key) is guaranteed to be in at least 3NF.
…Third Normal Form (3NF)
• That is, if X → Y is a non-trivial functional
dependency in R, then
– X is a superkey for schema R, or
– Attribute Y is a member of a candidate key (prime
attribute).
• Normalization (Decomposition)
– Decompose and set up a relation that includes the
non-key attribute(s) that functionally determine(s)
other non-key attributes.
…Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
– BCNF requires that there will be no non-trivial
functional dependencies of attributes on
something other than a superset of a candidate
key (called a superkey). At this stage, all attributes
are dependent on a key, a whole key and nothing
but a key (excluding trivial dependencies).
– A table is said to be in BCNF if and only if it
is in 3NF and every non-trivial,
left-irreducible functional dependency has a candidate
key as its determinant. In more informal terms, a
table is in BCNF if it is in 3NF and the only
determinants are the candidate keys.
…Third Normal Form (3NF)
– That is, if X → Y is a non-trivial functional
dependency in R, then
• X is a superkey for schema R.
– Note that the major goals of database design with
functional dependencies are:
• BCNF,
• Lossless join, and
• Dependency preservation;
– However, in certain situations it is necessary to
settle for 3NF instead of BCNF in order to preserve
dependencies.
…Third Normal Form (3NF)
• Example:
– A relation that is in 3NF but not in BCNF:
– R(A, B, C, D) and F = {AB → CD, BC → AD, A → C}
• AB and BC are candidate keys, thus
• A → C does not violate 3NF (C is a prime attribute),
whereas it violates BCNF since A is not a superkey.
– A relation that is in 3NF and in BCNF:
• R(A, B) is guaranteed to be in BCNF since its only
possible functional dependencies are A → B, B → A
and/or the trivial AB → AB.
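The first example can be checked mechanically: compute the candidate keys from attribute closures, then test each dependency against the BCNF and 3NF conditions. The function names below are hypothetical, and the brute-force key search is only suitable for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(schema):
                keys.append(combo)
    return keys

def check_normal_forms(schema, fds):
    """Return (candidate keys, BCNF violations, 3NF violations)."""
    keys = candidate_keys(schema, fds)
    prime = {a for k in keys for a in k}
    bcnf, third = [], []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency
        if closure(lhs, fds) == set(schema):
            continue  # the determinant is a superkey
        bcnf.append((lhs, rhs))
        if not (set(rhs) - set(lhs)) <= prime:
            third.append((lhs, rhs))  # a nonprime attribute is involved
    return keys, bcnf, third

# R(A, B, C, D) with F = {AB -> CD, BC -> AD, A -> C} from the example above.
keys, bcnf_violations, third_nf_violations = check_normal_forms(
    "ABCD", [("AB", "CD"), ("BC", "AD"), ("A", "C")]
)
```

For this schema the candidate keys come out as AB and BC, the only BCNF violation is A → C, and there is no 3NF violation, in line with the example.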
…Third Normal Form (3NF)
• Example: Consider the Projects relation from
the E/R model
– Projects(ProjId, Name, SDate, DDate, CustId,
Name, Address)
• ProjId → Name, SDate, DDate, CustId, Name, Address
• CustId → Name, Address
– Then upon decomposition we will have
• Projects(ProjId, Name, SDate, DDate, CustId)
• Customers(CustId, Name, Address)
3rd Normal Form
Third Normal Form is violated if:
Transactions Table
JENO  LINENO  ACCTNO  ACCTNAME       AMOUNT
1     1       100     Cash           20,000
1     2       310     Smith-Capital  (20,000)
2     1       100     Cash           30,000
2     2       220     Notes Payable  (30,000)
3     1       120     Supplies       5,000
3     2       100     Cash           (1,000)
3     3       220     Notes Payable  (4,000)
Are there any redundant facts?
3NF Example – Violation
FD that indicates violation of 3NF
Journal_Entry Table
JENO  DATE         DESCRIPTION
1     02-JAN-2003  Owner investment
2     03-JAN-2003  Borrowed money
3     03-JAN-2003  Purchased Supplies
Transactions Table
JENO  LINENO  ACCTNO  ACCTNAME       AMOUNT
1     1       100     Cash           20,000
1     2       310     Smith-Capital  (20,000)
2     1       100     Cash           30,000
2     2       220     Notes Payable  (30,000)
3     1       120     Supplies       5,000
3     2       100     Cash           (1,000)
3     3       220     Notes Payable  (4,000)
Anomalies if not corrected:
• update (if the name of account 100 changes, it must be changed in
multiple places, risking inconsistency)
• deletion (can't delete JE#3 and its transactions without losing
information about account 120)
• insertion (can't set up a new account, Jones-capital, for a new partner
unless we first have a transaction involving that account)
3NF Example – Corrected
Journal_Entry Table
JENO  DATE         DESCRIPTION
1     02-JAN-2003  Owner investment
2     03-JAN-2003  Borrowed money
3     03-JAN-2003  Purchased Supplies
Accounts Table
ACCTNO  ACCTNAME
100     Cash
120     Supplies
220     Notes Payable
310     Smith-Capital
Transactions Table
JENO  LINENO  ACCTNO  AMOUNT
1     1       100     20,000
1     2       310     (20,000)
2     1       100     30,000
2     2       220     (30,000)
3     1       120     5,000
3     2       100     (1,000)
3     3       220     (4,000)
3NF Example – Corrected
Final Dependencies
Fifth Normal Form (5NF or
PJNF)
…Fifth Normal Form (5NF or
PJNF)
• 5NF also known as Project-Join Normal Form
(PJNF) requires that there are no non-trivial
join dependencies that do not follow from the
key constraints. A table is said to be in the
5NF if and only if it is in 4NF and every join
dependency in it is implied by the candidate
keys.
• That is, if JD(R1, R2, … Rn) is a non-trivial
join dependency in R, then
– Every Ri is a superkey of R.
…Fifth Normal Form (5NF or
PJNF)
• A join dependency (JD), denoted by JD(R1,
R2, … Rn), specified on relational schema R,
specifies a constraint on the state r of R. The
constraint states that every legal state r of R
must have a nonadditive join decomposition into
R1, R2, … Rn.
• Although there are also other higher-level
normal forms such as DKNF and 6NF, most
relational database designs are sufficiently
normalized at BCNF level or even at 3NF.
4th Normal Form
4th Normal Form is violated if:
Instruments_Languages
Name Instrument Language
Fred Piano French
Fred Flute Italian
Fred Flute Spanish
Jane Piano French
Jane Oboe French
Sam Piano French
Sam Oboe Spanish
Sam Flute Spanish
4NF Example – Violation
LanguagesSpoken
Name  Language
Fred  French
Fred  Italian
Fred  Spanish
Jane  French
Sam   French
Sam   Spanish
InstrumentsPlayed
Name  Instrument
Fred  Piano
Fred  Flute
Jane  Piano
Jane  Oboe
Sam   Piano
Sam   Oboe
Sam   Flute
-End of chapter three-