RDBMS Unit3 Informaldesign Guidelines
THEORY
https://www.youtube.com/watch?v=NFk9sDJk50U
https://www.youtube.com/watch?v=gInecSg-36Y
Contents
• The ease with which the meaning of a relation's attributes can be explained
is an informal measure of how well the relation is designed.
• Hence, a relation schema whose meaning is easy to explain may be considered
good from the standpoint of having clear semantics.
• We can thus formulate the following informal design guideline.
GUIDELINE 1
• Design a relation schema so that it is easy to explain its meaning.
• Do not combine attributes from multiple entity types and relationship types
into a single relation.
Although there is nothing logically wrong with these two relations, they are
considered poor designs because they violate Guideline 1 by mixing attributes
from distinct real-world entities.
2. Redundant Information in Tuples and Update Anomalies
One goal of schema design is to minimize the storage space used by the base
relations.
Insertion Anomalies:
An Insert Anomaly occurs when certain attributes cannot be inserted into the
database without the presence of other attributes.
Deletion Anomalies:
Consider what happens if student S30 is the last student to leave the course:
all information about the course is lost.
Modification Anomalies:
Consider Jones changing address: every tuple that stores Jones's address must
be updated, otherwise the database holds inconsistent addresses for Jones.
Based on the preceding three anomalies, we can state the guideline that
follows:
GUIDELINE 2
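• Design the base relation schemas so that no insertion, deletion, or
modification anomalies are present in the relations.
• If any anomalies are present, note them clearly and make sure that the
programs that update the database will operate correctly.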
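The three anomalies can be reproduced on a small in-memory relation. The sketch below is only an illustration: it assumes a hypothetical single relation that stores student and course facts together, and every row, attribute name, address, and course title in it is invented (only S30 and Jones come from the examples above).

```python
# Hypothetical "mixed" relation: student and course facts stored in one table.
enrolment = [
    {"sid": "S10", "name": "Jones", "address": "12 Elm St", "cid": "C1", "title": "Databases"},
    {"sid": "S10", "name": "Jones", "address": "12 Elm St", "cid": "C2", "title": "Networks"},
    {"sid": "S30", "name": "Smith", "address": "9 Oak Ave", "cid": "C3", "title": "Compilers"},
]


def course_titles(rows):
    """Return the set of course titles recorded anywhere in the relation."""
    return {r["title"] for r in rows}


# Deletion anomaly: S30 is the only student on C3, so removing S30's
# enrolment also removes the only record that course C3 exists.
after_delete = [r for r in enrolment if r["sid"] != "S30"]
lost_courses = course_titles(enrolment) - course_titles(after_delete)

# Modification anomaly: updating only one of Jones's rows leaves the
# relation with two different addresses for the same student.
partially_updated = [dict(r) for r in enrolment]
partially_updated[0]["address"] = "77 Pine Rd"
jones_addresses = {r["address"] for r in partially_updated if r["sid"] == "S10"}

# Insertion anomaly: a course with no students yet can only be stored
# by padding the student attributes with NULLs (None).
new_course_row = {"sid": None, "name": None, "address": None, "cid": "C4", "title": "Graphics"}

print(lost_courses)     # courses that vanished together with the last student
print(jones_addresses)  # more than one address signals an inconsistency
```

Splitting the data into separate student, course, and enrolment relations would remove all three problems, at the cost of a join when the combined view is needed.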
3. Null Values in Tuples
• In some schema designs, we may group many attributes together into a
"fat" relation (a single relation with a large number of attributes, not all
of which are fully functionally dependent on the primary key).
• For example: in a Student relation, a student may have multiple phone
numbers, say phno1, phno2 and phno3. Only a few students will have more
than two phone numbers, so for the rest of the students those attributes
will be blank or NULL. We should try to avoid this.
• Another example: a Department may have multiple locations, but not every
department has more than one location, so the extra location attributes of
the remaining tuples will be filled with NULL.
• If many of the attributes do not apply to all tuples in the relation, we end up
with many nulls in those tuples.
• For example: if a relation has an apartment number attribute and you do
not live in an apartment, the value of that attribute will be NULL because
it is not applicable to you.
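One common way to avoid these NULLs is to move the repeating attribute into its own relation. The following is a minimal sketch under assumed data: the relation names `students` and `student_phone`, the helper functions, and all rows are hypothetical and chosen only to mirror the phno1, phno2, phno3 example.

```python
# Hypothetical "fat" student relation with one column per possible phone number.
fat_students = [
    {"sid": "S1", "name": "Asha", "phno1": "555-0101", "phno2": None, "phno3": None},
    {"sid": "S2", "name": "Ravi", "phno1": "555-0102", "phno2": "555-0103", "phno3": None},
    {"sid": "S3", "name": "Meena", "phno1": None, "phno2": None, "phno3": None},
]

PHONE_COLUMNS = ("phno1", "phno2", "phno3")


def count_nulls(rows, columns):
    """Count how many of the given column values are NULL (None)."""
    return sum(1 for r in rows for c in columns if r[c] is None)


def split_phones(rows):
    """Decompose into students(sid, name) and student_phone(sid, phone)."""
    students = [{"sid": r["sid"], "name": r["name"]} for r in rows]
    phones = [
        {"sid": r["sid"], "phone": r[c]}
        for r in rows
        for c in PHONE_COLUMNS
        if r[c] is not None  # store a row only when a phone number exists
    ]
    return students, phones


students, student_phone = split_phones(fat_students)
print(count_nulls(fat_students, PHONE_COLUMNS))  # NULLs in the fat design
print(len(student_phone))                        # one row per existing phone number
```

In the decomposed design a student without a phone simply has no row in `student_phone`, so no NULL has to be stored or interpreted.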
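GUIDELINE 3
• As far as possible, avoid placing attributes in a base relation whose
values may frequently be NULL.
• If NULLs are unavoidable, make sure that they apply in exceptional cases
only and do not apply to a majority of tuples in the relation.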
4. Generation of Spurious Tuples
For example, consider two relation schemas:
EMP_LOCS(ename, plocation)
EMP_PROJ1(eno, pnumber, hours, pname, plocation)
• If we attempt a natural join on these two relations, the result contains
many more tuples than the original set of tuples.
• The additional tuples that were not in the original relation are called
spurious tuples because they represent wrong information that is not valid.
Using EMP_PROJ1 and EMP_LOCS as the base relations is therefore not a good
schema design, because joining them does not reproduce the original information.
The spurious tuples arise because plocation, the attribute used to join the two
relations, is neither a primary key nor a foreign key in either EMP_LOCS or
EMP_PROJ1.
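The effect can be checked on a tiny instance. The sketch below assumes a hypothetical original relation that contains all six attributes, projects it onto the EMP_LOCS and EMP_PROJ1 schemas, and joins the projections back on their common attribute. The helper functions `natural_join` and `project` and all data values are invented for illustration.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    if not r or not s:
        return []
    common = sorted(set(r[0]) & set(s[0]))  # attributes shared by both schemas
    joined = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in common):
                joined.append({**a, **b})
    return joined


def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


# Hypothetical original relation before decomposition.
original = [
    {"eno": 1, "ename": "Smith", "pnumber": 10, "hours": 20, "pname": "ProductX", "plocation": "Houston"},
    {"eno": 2, "ename": "Wong", "pnumber": 20, "hours": 15, "pname": "ProductY", "plocation": "Houston"},
    {"eno": 3, "ename": "Zelaya", "pnumber": 30, "hours": 10, "pname": "ProductZ", "plocation": "Dallas"},
]

emp_locs = project(original, ["ename", "plocation"])
emp_proj1 = project(original, ["eno", "pnumber", "hours", "pname", "plocation"])

# Join the two projections back on their only common attribute, plocation.
rejoined = natural_join(emp_proj1, emp_locs)
spurious = [t for t in rejoined if t not in original]

print(len(original), len(rejoined), len(spurious))
```

Both Houston employees are paired with both Houston projects, so the join invents combinations that were never in the original data.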
GUIDELINE 4
• Design relation schemas so that they can be joined with equality conditions
on attributes that are either primary keys or foreign keys in a way that
guarantees that no spurious tuples are generated.
• Avoid relations that contain matching attributes that are not (foreign key,
primary key) combinations, because joining on such attributes may
produce spurious tuples.
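By contrast, a join on a (foreign key, primary key) pair matches each referencing tuple with exactly one referenced tuple. The sketch below uses Python's standard `sqlite3` module with two hypothetical tables, `employee` and `works_on`; the table names, columns, and rows are assumptions made for illustration and are not taken from the schemas above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (
        eno   INTEGER PRIMARY KEY,
        ename TEXT NOT NULL
    );
    CREATE TABLE works_on (
        eno     INTEGER NOT NULL REFERENCES employee(eno),
        pnumber INTEGER NOT NULL,
        hours   REAL,
        PRIMARY KEY (eno, pnumber)
    );
    INSERT INTO employee VALUES (1, 'Smith'), (2, 'Wong');
    INSERT INTO works_on VALUES (1, 10, 20), (2, 10, 15), (2, 20, 5);
""")

# Equality join on works_on.eno (foreign key) = employee.eno (primary key).
rows = conn.execute("""
    SELECT e.eno, e.ename, w.pnumber, w.hours
    FROM works_on AS w
    JOIN employee AS e ON e.eno = w.eno
    ORDER BY e.eno, w.pnumber
""").fetchall()

works_on_count = conn.execute("SELECT COUNT(*) FROM works_on").fetchone()[0]
conn.close()

# Each works_on row matches exactly one employee row, so the join has
# the same number of rows as works_on and nothing spurious is created.
print(len(rows) == works_on_count)
```

Because `eno` is unique in `employee`, no `works_on` row can pair with more than one employee, which is the property Guideline 4 asks for.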
Functional Dependencies
Normal Forms Based on Primary Keys
Normalization of Relations
The normalization process was first proposed by Codd (1972). Codd initially
proposed three normal forms, which he called first, second, and third normal form
(1NF, 2NF, and 3NF). A stronger definition of 3NF, called Boyce-Codd normal form
(BCNF), was proposed later by Boyce and Codd.
All these normal forms are based on functional dependencies among the attributes
of a relation.
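Since the normal forms rest on functional dependencies, it can help to see what checking one against data looks like. The sketch below assumes a hypothetical helper `fd_holds` and invented sample rows. Note that a relation instance can only refute a dependency: an FD is a constraint on the schema, so finding no violation in one instance does not prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y for a new x and returns the stored value otherwise.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical instance: which employee works how many hours on which project.
rows = [
    {"eno": 1, "pnumber": 10, "hours": 20, "pname": "ProductX"},
    {"eno": 1, "pnumber": 20, "hours": 5, "pname": "ProductY"},
    {"eno": 2, "pnumber": 10, "hours": 15, "pname": "ProductX"},
]

print(fd_holds(rows, ["pnumber"], ["pname"]))         # not violated in this instance
print(fd_holds(rows, ["eno"], ["pname"]))             # violated: employee 1 has two projects
print(fd_holds(rows, ["eno", "pnumber"], ["hours"]))  # not violated in this instance
```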
Later, a fourth normal form (4NF) and a fifth normal form (5NF) were proposed,
based on the concepts of multivalued dependencies and join dependencies,
respectively.
• Normalization of data can be considered a process of analyzing
the given relation schemas based on their FDs and primary keys to
achieve the desirable properties of
(1) minimizing redundancy and
(2) minimizing the insertion, deletion, and update anomalies.