CH - 3 Fundamentals of A Database System
Employees
EmpId   Name           BDate      SubCity   Kebele   Phone
E001    Alemu Girma    01/10/70   Bole      06       011-663-0712
E004    Kelem Belete   12/04/68   Gulele    03       011-227-2525
Fig 1. Typical Employee relation instance
The columns of the table represent the attributes of the relation, and the rows (other
than the heading row) represent the tuples (records) of the relation.
A relation in the relational model consists of:
- The relation schema: describes the column heads of the table, and
- The relation instance: the table with its set of tuples.
The set of relation schemas forms the schema of the relational database, called the database
schema (relational database schema).
In the relational model, the relation schemas are described first. A schema specifies:
- The relation's name
- The name of each attribute (field or column)
- The domain of each attribute: a domain is referred to in a relation schema by its domain
name and has a set of associated values.
Example
- Employees (EmpId:string, Name:string, BDate:date, SubCity:string, Kebele:integer, Phone:string)
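The distinction between a relation schema and a relation instance can be illustrated with a small sketch using Python's built-in sqlite3 module. The mapping of domains to SQLite types (string and date to TEXT, integer to INTEGER) is an assumption made for this illustration, not something the schema above prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation schema: relation name, attribute names, and a domain per attribute.
# Assumption: string/date domains are stored as TEXT, integer as INTEGER.
conn.execute(
    """
    CREATE TABLE Employees (
        EmpId   TEXT PRIMARY KEY,
        Name    TEXT,
        BDate   TEXT,
        SubCity TEXT,
        Kebele  INTEGER,
        Phone   TEXT
    )
    """
)

# Relation instance: the tuples from Fig 1.
rows = [
    ("E001", "Alemu Girma", "01/10/70", "Bole", 6, "011-663-0712"),
    ("E004", "Kelem Belete", "12/04/68", "Gulele", 3, "011-227-2525"),
]
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?, ?, ?)", rows)

# Read the schema back as (attribute name, declared type) pairs.
schema = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(Employees)")]

# Read the instance back as a list of tuples.
instance = conn.execute("SELECT * FROM Employees ORDER BY EmpId").fetchall()
```

Here `schema` stays the same as tuples are inserted or deleted, while `instance` changes with every update to the table.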
REMARK: Note that a key for a relation may not be directly inferred from the high-level
conceptual models in some cases.
In the example of Fig 2, for “WorkSchedule” to refer to the “Employees” relation instance, it
has an attribute ‘Employee’ of the same type as ‘EmpId’, which is the primary key of the
“Employees” relation. The foreign key constraint is implemented through the ‘Employee’
attribute in the referencing relation “WorkSchedule”.
(a) Referencing relation: WorkSchedule
…   Hours   Employee
…   8       E001
…   6       E004
…   8       E002
…   4       E004

(b) Referenced relation: Employees
EmpId   Name           …
E001    Alemu Girma    …
E004    Kelem Belete   …
E002    Mulken Getu    …

Fig 2. Foreign Key Constraint in Relational Model
NOTE: - A single tuple can be referenced by zero or more tuples in the referencing relation, but
a referencing tuple with a single foreign key attribute can reference only one tuple.
- A foreign key can refer to the same relation.
- A relational database consists of relations that are related to one another through foreign keys.
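The foreign key constraint of Fig 2 can be sketched with Python's built-in sqlite3 module. Only the columns visible in the figure are modelled (the elided ones are left out), the helper name `try_insert` and the identifier E999 are hypothetical, and the `PRAGMA foreign_keys` line is specific to SQLite, which does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

# Referenced relation (only the columns shown in Fig 2).
conn.execute("CREATE TABLE Employees (EmpId TEXT PRIMARY KEY, Name TEXT)")

# Referencing relation: 'Employee' has the same type as 'EmpId'.
conn.execute(
    """
    CREATE TABLE WorkSchedule (
        Hours    INTEGER,
        Employee TEXT REFERENCES Employees(EmpId)
    )
    """
)

conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("E001", "Alemu Girma"), ("E004", "Kelem Belete"), ("E002", "Mulken Getu")],
)
conn.executemany(
    "INSERT INTO WorkSchedule VALUES (?, ?)",
    [(8, "E001"), (6, "E004"), (8, "E002"), (4, "E004")],
)


def try_insert(hours, employee):
    """Hypothetical helper: return True if the tuple was accepted."""
    try:
        conn.execute("INSERT INTO WorkSchedule VALUES (?, ?)", (hours, employee))
        return True
    except sqlite3.IntegrityError:
        return False


# 'E999' is a made-up id with no matching Employees tuple, so it is rejected.
accepted_unknown = try_insert(5, "E999")

# One referenced tuple (E004) may be referenced by several referencing tuples.
refs_to_e004 = conn.execute(
    "SELECT COUNT(*) FROM WorkSchedule WHERE Employee = 'E004'"
).fetchone()[0]
```

The rejected insert shows the constraint at work, and the count for E004 shows that one referenced tuple can have several referencing tuples.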
The relations derived from the strong entity sets that have only simple and single-valued
attributes are then as follows:
- Projects(ProjId, Name, SDate, DDate)
- Customers(CustId, Name, Address)
Dependencies
In database design, the two most common pitfalls that result in a bad design are:
- Repetition of information, and
- Inability to represent certain information (loss of information).
Functional Dependencies
Functional dependency is a kind of constraint that helps to remove redundancy in relational
database design.
A1 A2 A3…An → B1
A1 A2 A3…An → B2
:
A1 A2 A3…An → Bm
can be written as:
A1 A2 A3…An → B1 B2 … Bm
Splitting Rule
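The combining of right-hand sides shown above, and the splitting that reverses it, can be sketched in Python as follows. The function names (`fd_holds`, `split`, `combine`) are hypothetical, and so is the SubCity value for E002; the sample rows are for illustration only. The check applies the standard definition: a functional dependency holds in an instance if any two tuples that agree on the left-hand side also agree on the right-hand side.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # The first tuple with this key fixes the expected right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


def split(lhs, rhs):
    """Splitting: A -> B1 ... Bm becomes A -> B1, ..., A -> Bm."""
    return [(lhs, [b]) for b in rhs]


def combine(fds):
    """Combining: dependencies sharing a left-hand side become one."""
    lhs = fds[0][0]
    if any(left != lhs for left, _ in fds):
        raise ValueError("all dependencies must share the same left-hand side")
    return (lhs, [b for _, right in fds for b in right])


# Sample instance; the SubCity value for E002 is invented for this sketch.
rows = [
    {"EmpId": "E001", "Name": "Alemu Girma", "SubCity": "Bole"},
    {"EmpId": "E004", "Name": "Kelem Belete", "SubCity": "Gulele"},
    {"EmpId": "E002", "Name": "Mulken Getu", "SubCity": "Bole"},
]

combined = (["EmpId"], ["Name", "SubCity"])
parts = split(*combined)

combined_holds = fd_holds(rows, *combined)
parts_hold = all(fd_holds(rows, lhs, rhs) for lhs, rhs in parts)
subcity_determines_name = fd_holds(rows, ["SubCity"], ["Name"])
```

On this instance the combined dependency and each of its split parts hold together, while SubCity → Name fails because two employees share the same SubCity.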
Multivalued Dependencies
A multivalued dependency for a relation R is a constraint stating that when the values of one
set of attributes are fixed, the values of certain other attributes are independent of the values
of all the remaining attributes in R.
That is, for a multivalued dependency X →→ Y in R, where X and Y are subsets of the set of
attributes in R, if t and u are tuples in the relation instance r for the schema R that agree on
X, then there exists a tuple v in r that agrees:
1. with both t and u on X’s,
2. with t on Y’s, and
3. with u on all attributes of R that are not among X’s or Y’s (R – (X ∪ Y)).
The complementation rule states that if X →→ Y holds, then X →→ (R – (X ∪ Y)) also holds,
where R is the set of attributes of the relational schema R.
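The three conditions on the tuple v can be checked mechanically on a given instance. The following Python sketch does so; the function names and the sample relation (EmpId, Project, Skill) are hypothetical and chosen only for illustration. It tests one instance, so it can refute a multivalued dependency but cannot prove that it holds for every legal instance of the schema.

```python
def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y on one relation instance given as a list of dicts."""
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t in rows:
        for u in rows:
            if any(t[a] != u[a] for a in x):
                continue  # only pairs that agree on X are constrained
            # Build v: X and Y values from t, all remaining values from u.
            v = tuple(t[a] if (a in x or a in y) else u[a] for a in attrs)
            if v not in present:
                return False
    return True


def complement(attrs, x, y):
    """Attributes of R that are in neither X nor Y, i.e. R - (X U Y)."""
    return [a for a in attrs if a not in x and a not in y]


ATTRS = ["EmpId", "Project", "Skill"]

# Hypothetical instance: every project of E001 is paired with every skill.
rows = [
    {"EmpId": "E001", "Project": p, "Skill": s}
    for p in ("P1", "P2")
    for s in ("S1", "S2")
]

holds = mvd_holds(rows, ATTRS, ["EmpId"], ["Project"])
holds_complement = mvd_holds(
    rows, ATTRS, ["EmpId"], complement(ATTRS, ["EmpId"], ["Project"])
)

# Dropping one combination breaks the independence of Project and Skill.
holds_after_removal = mvd_holds(rows[:-1], ATTRS, ["EmpId"], ["Project"])
```

In this instance EmpId →→ Project holds and, in line with the complementation rule, so does EmpId →→ Skill; removing a single tuple makes the dependency fail.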
- Insertion Anomalies: Suppose we want to insert a new employee who works on project 1
as a programmer; the corresponding fields for the Team details then have to be entered
correctly. If the data is entered incorrectly, consistency will be violated.
- Deletion Anomalies: Suppose E003 is to be removed from the employees list; the Team
information of TeamId 5 will then also be removed, and vice versa.
- Modification Anomalies: During a data update, consistency may also be violated, as in
the case of insertion.
Although normalization is a way to remove redundancy and anomalies and to preserve
consistency, integrity and maintainability, it may also lead to:
- Increase in storage space
- Complex queries (queries with many joins of tables)
In such situations it may be desirable to denormalize some of the tables in order to reduce
storage space and the number of required joins.
Denormalization is the process of selectively taking normalized tables and re-combining the data
in them. Sometimes the addition of a single column of redundant data to a table from another
table can reduce a 4-way join into a 2-way join, significantly boosting performance by reducing
the time it takes to perform the join.
Databases intended for Online Transaction Processing (OLTP) are typically normalized. By
contrast, databases intended for Online Analytical Processing (OLAP) operations are primarily
"read only" databases and tend to extract historical data that has accumulated over quite a
long time. For such databases, redundant or "denormalized" data may facilitate Business
Intelligence applications.
While denormalization can improve storage and query performance, it can also have negative
effects. For example, by adding redundant data to tables, you risk the following problems:
- More data means the RDBMS has to read more data pages than otherwise needed,
hurting performance.
- Redundant data can lead to data anomalies and bad data.
- In many cases, extra code will have to be written to keep redundant data in separate
tables in sync, which adds to database overhead.
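The trade-off described above can be sketched on a small scale with Python's built-in sqlite3 module. The example removes a single join rather than turning a 4-way join into a 2-way join; the redundant column `EmployeeName` and the changed name 'Alemu G.' are hypothetical and introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employees (EmpId TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE WorkSchedule (
        Hours    INTEGER,
        Employee TEXT REFERENCES Employees(EmpId)
    );
    INSERT INTO Employees VALUES ('E001', 'Alemu Girma'), ('E004', 'Kelem Belete');
    INSERT INTO WorkSchedule VALUES (8, 'E001'), (6, 'E004'), (4, 'E004');
    """
)

# Normalized design: names are reached through a join.
normalized = conn.execute(
    """
    SELECT e.Name, w.Hours
    FROM WorkSchedule w JOIN Employees e ON e.EmpId = w.Employee
    ORDER BY e.Name, w.Hours
    """
).fetchall()

# Denormalized design: copy the name into WorkSchedule (redundant column).
conn.executescript(
    """
    ALTER TABLE WorkSchedule ADD COLUMN EmployeeName TEXT;
    UPDATE WorkSchedule
    SET EmployeeName = (SELECT Name FROM Employees WHERE EmpId = WorkSchedule.Employee);
    """
)
denormalized = conn.execute(
    "SELECT EmployeeName, Hours FROM WorkSchedule ORDER BY EmployeeName, Hours"
).fetchall()

# The cost: updating only the source table leaves the redundant copy stale.
conn.execute("UPDATE Employees SET Name = 'Alemu G.' WHERE EmpId = 'E001'")
stale_rows = conn.execute(
    """
    SELECT COUNT(*)
    FROM WorkSchedule w JOIN Employees e ON e.EmpId = w.Employee
    WHERE w.EmployeeName <> e.Name
    """
).fetchone()[0]
```

Both queries return the same result at first, but the second needs no join. After the name is changed in Employees only, the copy in WorkSchedule is out of date, which is the kind of anomaly that extra synchronization code would have to prevent.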
Normal Forms
The normalization procedure provides:
- A framework for analyzing relation schemas based on functional and multivalued
dependencies.
- A series of normal form tests that can be carried out on individual relation schemas so
that the relational database can be normalized to any degree.
Normalization through decomposition needs to preserve two additional properties of a
relational schema:
- Lossless or Nonadditive Join: the nonadditive join property guarantees that spurious
tuples are not generated when the decomposed relations are joined back together.
- Dependency Preservation: dependency preservation ensures that each functional
dependency is represented in one of the individual relations resulting from the decomposition.
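The nonadditive join property can be illustrated by decomposing a small instance, joining the parts back, and comparing the result with the original. The Python sketch below does this; the function names and the sample rows (including the Project values) are hypothetical. It tests a single instance only, whereas the property itself is a statement about all legal instances of the schema.

```python
def project(rows, attrs):
    """Projection onto attrs; duplicates disappear because a set is used."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Natural join of two projections produced by project()."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {frozenset(r.items()) for r in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Hypothetical instance: two employees working on the same project.
rows = [
    {"EmpId": "E001", "SubCity": "Bole", "Project": "P1"},
    {"EmpId": "E004", "SubCity": "Gulele", "Project": "P1"},
]

# Decomposition sharing EmpId: the join reproduces the original tuples.
good = is_lossless(rows, ["EmpId", "SubCity"], ["EmpId", "Project"])

# Decomposition sharing only Project: the join pairs every employee with
# every SubCity, producing spurious tuples.
bad = is_lossless(rows, ["EmpId", "Project"], ["Project", "SubCity"])
spurious = natural_join(
    project(rows, ["EmpId", "Project"]), project(rows, ["Project", "SubCity"])
)
```

The first decomposition joins back to the two original tuples. The second yields four tuples, two of which (for example E001 living in Gulele) were never in the original relation.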
Although there are also higher-level normal forms such as 5NF or PJNF, DKNF
and 6NF, most relational database designs are sufficiently normalized at the BCNF level or even at 3NF.