unit-3-dbms

Uploaded by

justice.chitra.v
Unit-3

• Guidelines for Relational Schema: Functional dependency;
Normalization; Boyce-Codd Normal Form; Multivalued
dependency and Fourth Normal Form; Join dependency and
Fifth Normal Form.
Redundant Information in Tuples and Update Anomalies
• Information is stored redundantly
– Wastes storage
– Causes problems with update anomalies
• Insertion anomalies
• Deletion anomalies
• Modification anomalies
EXAMPLE OF AN UPDATE ANOMALY

• Consider the relation:
– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
• Update Anomaly:
– Changing the name of project number P1 from
“Billing” to “Customer-Accounting” requires the
update to be made in the tuples of all 100 employees
working on project P1; if any tuple is missed, the
relation becomes inconsistent.
EXAMPLE OF AN INSERT ANOMALY

• Consider the relation:
– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
• Insert Anomaly:
– Cannot insert a project unless an employee is
assigned to it.
• Conversely
– Cannot insert an employee unless he/she is
assigned to a project.
EXAMPLE OF A DELETE ANOMALY

• Consider the relation:
– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
• Delete Anomaly:
– When a project is deleted, it will result in deleting
all the employees who work on that project.
– Alternatively, if an employee is the sole employee
on a project, deleting that employee would result
in deleting the corresponding project.
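The update and delete anomalies above can be reproduced on a tiny in-memory copy of EMP_PROJ. This is only a sketch: the rows, the employee names and the helper functions are hypothetical and exist purely for illustration.

```python
# Hypothetical EMP_PROJ rows: (emp_no, proj_no, ename, pname, no_hours)
emp_proj = [
    ("E1", "P1", "Asha", "Billing", 10),
    ("E2", "P1", "Ravi", "Billing", 20),
    ("E3", "P2", "Mina", "Payroll", 15),
]

def rename_project(rows, proj_no, new_name):
    """Rename a project; every row of that project has to be rewritten."""
    return [(e, p, en, new_name if p == proj_no else pn, h)
            for (e, p, en, pn, h) in rows]

def delete_employee(rows, emp_no):
    """Remove all rows that belong to one employee."""
    return [r for r in rows if r[0] != emp_no]

def project_names(rows):
    """Map each project number to the set of names recorded for it."""
    names = {}
    for (_, p, _, pn, _) in rows:
        names.setdefault(p, set()).add(pn)
    return names

# Update anomaly: changing only ONE of the P1 rows leaves two names for P1.
first = emp_proj[0]
partial = [first[:3] + ("Customer-Accounting",) + first[4:]] + emp_proj[1:]
inconsistent = project_names(partial)["P1"]

# A correct update has to touch every P1 row.
renamed = rename_project(emp_proj, "P1", "Customer-Accounting")

# Delete anomaly: E3 is the only employee on P2, so P2 disappears as well.
after_delete = delete_employee(emp_proj, "E3")
p2_survives = "P2" in project_names(after_delete)
```

The insert anomaly shows up in the same structure: a row for a new project cannot be formed without some value for the employee columns.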
Spurious Tuples
• Bad designs for a relational database may result
in erroneous results for certain JOIN operations
• The "lossless join" property is used to guarantee
meaningful results for join operations
– The relations should be designed to satisfy the
lossless join condition.
– No spurious tuples (tuples that were not in the
original relation) should be generated by doing a
natural-join of any relations.
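A small sketch of how spurious tuples arise. The relation contents, the attribute names and the helpers relation, project and natural_join are assumptions made for this example; the first decomposition joins on an attribute that is not a key of either part, the second joins on emp.

```python
def relation(rows):
    """Build a relation as a set of tuples (frozensets of attr/value pairs)."""
    return {frozenset(r.items()) for r in rows}

def project(rel, attrs):
    """Projection onto attrs; duplicates vanish because rel is a set."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

# Hypothetical data: two employees on different projects, same hours.
emp_proj = relation([
    {"emp": "E1", "ename": "Asha", "proj": "P1", "hours": 10},
    {"emp": "E2", "ename": "Ravi", "proj": "P2", "hours": 10},
])

# Bad design: the two parts share only "hours".
lossy = natural_join(project(emp_proj, {"emp", "ename", "hours"}),
                     project(emp_proj, {"proj", "hours"}))
spurious = lossy - emp_proj          # tuples that were never in emp_proj

# Better design: the shared attribute "emp" determines "ename".
lossless = natural_join(project(emp_proj, {"emp", "ename"}),
                        project(emp_proj, {"emp", "proj", "hours"}))
```

Here the lossy join pairs every employee with every project that has the same hours, while the second decomposition gives back exactly the original tuples.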
Functional Dependencies (1)
• Functional dependencies (FDs)
– Are used to specify formal measures of the
"goodness" of relational designs
– Together with keys, are used to define normal forms
for relations
– Are constraints that are derived from the meaning
and interrelationships of the data attributes
• A set of attributes X functionally determines a
set of attributes Y if the value of X determines a
unique value for Y
Examples of FD constraints (1)
• Social security number determines employee
name
– SSN -> ENAME
• Project number determines project name and
location
– PNUMBER -> {PNAME, PLOCATION}
• Employee SSN and project number together determine
the hours per week that the employee works
on the project
– {SSN, PNUMBER} -> HOURS
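The FD examples above can be tested against a relation state with a short check. Note the limitation: an FD is a constraint on the schema, so a single state can only refute it, never prove it. The function name fd_holds and the sample rows are assumptions for illustration.

```python
def fd_holds(rows, X, Y):
    """Return True if no two rows agree on X but differ on Y."""
    seen = {}
    for r in rows:
        x = tuple(r[a] for a in X)
        y = tuple(r[a] for a in Y)
        # setdefault stores y the first time x is seen, then returns it
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical relation state
rows = [
    {"SSN": "111", "ENAME": "Smith", "PNUMBER": 1, "PNAME": "ProductX",
     "PLOCATION": "Bellaire", "HOURS": 32.5},
    {"SSN": "111", "ENAME": "Smith", "PNUMBER": 2, "PNAME": "ProductY",
     "PLOCATION": "Sugarland", "HOURS": 7.5},
    {"SSN": "222", "ENAME": "Wong", "PNUMBER": 2, "PNAME": "ProductY",
     "PLOCATION": "Sugarland", "HOURS": 10.0},
]

ssn_ename = fd_holds(rows, ["SSN"], ["ENAME"])
pnum_pinfo = fd_holds(rows, ["PNUMBER"], ["PNAME", "PLOCATION"])
ssn_pnum_hours = fd_holds(rows, ["SSN", "PNUMBER"], ["HOURS"])
ssn_hours = fd_holds(rows, ["SSN"], ["HOURS"])   # violated by SSN 111
```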
Inference Rules for FDs (1)
• Given a set of FDs F, we can infer additional FDs that hold
whenever the FDs in F hold
• Armstrong's inference rules:
– IR1. (Reflexive) If Y ⊆ X, then X -> Y
– IR2. (Augmentation) If X -> Y, then XZ -> YZ
• (Notation: XZ stands for X ∪ Z)
– IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z

• IR1, IR2, IR3 form a sound and complete set of inference rules
– These rules hold (soundness), and all other rules that hold can be
deduced from them (completeness)
Inference Rules for FDs (2)
• Some additional inference rules that are
useful:
– Decomposition: If X -> YZ, then X -> Y and X -> Z
– Union: If X -> Y and X -> Z, then X -> YZ
– Pseudotransitivity:
If X -> Y and WY -> Z, then WX -> Z

• The last three inference rules, as well as any
other inference rules, can be deduced from
IR1, IR2, and IR3 (completeness property)
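In practice, whether an FD can be inferred from a set F is usually decided with the attribute closure X+ rather than by applying the rules one by one: X -> Y follows from F exactly when Y ⊆ X+. A minimal sketch, where the function names closure and implies are assumptions:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= closure(lhs, fds)

# The FDs from the earlier examples
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
key_closure = closure({"SSN", "PNUMBER"}, F)

# Pseudotransitivity instance: X -> Y and WY -> Z give WX -> Z
G = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]
wx_z = implies(G, {"W", "X"}, {"Z"})
x_z = implies(G, {"X"}, {"Z"})       # not implied: W is missing
```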
First Normal Form
• Disallows
– composite attributes
– multivalued attributes
• Considered to be part of the definition of a
relation
Normalization into 1NF
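One common way to remove a multivalued attribute is to repeat the tuple once per value. The sketch below assumes a hypothetical DEPARTMENT relation with a multivalued DLOCATIONS attribute; the data and the helper to_1nf are illustrative only.

```python
# Hypothetical non-1NF state: DLOCATIONS holds a list of values
department = [
    {"DNAME": "Research", "DNUMBER": 5,
     "DLOCATIONS": ["Bellaire", "Sugarland", "Houston"]},
    {"DNAME": "Administration", "DNUMBER": 4, "DLOCATIONS": ["Stafford"]},
]

def to_1nf(rows, multi_attr, single_attr):
    """Expand each row into one row per value of the multivalued attribute."""
    flat = []
    for r in rows:
        for value in r[multi_attr]:
            new_row = {k: v for k, v in r.items() if k != multi_attr}
            new_row[single_attr] = value
            flat.append(new_row)
    return flat

department_1nf = to_1nf(department, "DLOCATIONS", "DLOCATION")
```

This form is in 1NF but repeats DNAME for every location, so the key becomes {DNUMBER, DLOCATION}; moving the locations into a separate relation avoids that redundancy.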
Second Normal Form (1)
• Uses the concepts of FDs, candidate key
• Definitions
– Prime attribute: An attribute that is a member of some candidate
key K
– Full functional dependency: an FD Y -> Z where removal of any
attribute from Y means the FD does not hold any more
• Examples:
– {SSN, PNUMBER} -> HOURS is a full FD since neither SSN ->
HOURS nor PNUMBER -> HOURS holds
– {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial
dependency) since SSN -> ENAME also holds
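The full/partial distinction can be checked mechanically: Y -> Z is full if it holds and stops holding when any single attribute is dropped from Y. A sketch using attribute closure; the helper names closure and is_full_fd are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_full_fd(lhs, rhs, fds):
    """True if lhs -> rhs holds and no attribute of lhs can be removed."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        return False                     # the FD does not hold at all
    # Dropping one attribute at a time is enough: any smaller left side
    # is contained in one of these reduced sets.
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)

F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
hours_is_full = is_full_fd({"SSN", "PNUMBER"}, {"HOURS"}, F)
ename_is_full = is_full_fd({"SSN", "PNUMBER"}, {"ENAME"}, F)   # partial
```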
Second Normal Form (2)
• A relation schema R is in second normal form
(2NF) if every non-prime attribute A in R is
fully functionally dependent on every candidate
key of R
• R can be decomposed into 2NF relations via
the process of 2NF normalization
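A rough sketch of a 2NF test: find the candidate keys by brute force, then list every non-prime attribute that depends on a proper subset of a key. The schema is the EMP_PROJ example written with SSN and PNUMBER; all function names are assumptions, and the brute-force key search is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attrs, keys = set(attrs), []
    for n in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), n):
            s = set(combo)
            if any(k <= s for k in keys):
                continue                  # contains a smaller key: not minimal
            if closure(s, fds) == attrs:
                keys.append(s)
    return keys

def partial_dependencies(attrs, fds):
    """(part_of_key, non_prime_attribute) pairs that violate 2NF."""
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    bad = []
    for key in keys:
        for n in range(1, len(key)):
            for part in combinations(sorted(key), n):
                for a in sorted(closure(part, fds) - set(part) - prime):
                    bad.append((set(part), a))
    return bad

R = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION"}
F = [({"SSN"}, {"ENAME"}),
     ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
     ({"SSN", "PNUMBER"}, {"HOURS"})]
violations = partial_dependencies(R, F)

# After decomposition, the relation that keeps HOURS has no violations.
ep1 = partial_dependencies({"SSN", "PNUMBER", "HOURS"},
                           [({"SSN", "PNUMBER"}, {"HOURS"})])
```

Each violating pair suggests one relation of the 2NF decomposition: the key part together with the attributes it determines.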
Normalizing into 2NF
Third Normal Form (1)
• Definition:
– Transitive functional dependency: an FD X -> Z
that can be derived from two FDs X -> Y and Y -> Z
• Examples:
– SSN -> DMGRSSN is a transitive FD
• Since SSN -> DNUMBER and DNUMBER -> DMGRSSN
hold
– SSN -> ENAME is non-transitive
• Since there is no set of attributes X where SSN -> X and
X -> ENAME
Third Normal Form (2)
• A relation schema R is in third normal form (3NF) if it is in 2NF
and no non-prime attribute A in R is transitively dependent on
the candidate key
• R can be decomposed into 3NF relations via the process of
3NF normalization
• NOTE:
– In X -> Y and Y -> Z, with X as the primary key, we consider this a
problem only if Y is not a candidate key.
– When Y is a candidate key, there is no problem with the
transitive dependency.
– E.g., consider EMP(SSN, Emp#, Salary).
• Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
Normalizing into 3NF
Normalization into 2NF and 3NF
General Normal Form Definitions (2)
• Definition:
– Superkey of relation schema R - a set of attributes
S of R that contains a key of R
– A relation schema R is in third normal form (3NF)
if whenever a FD X -> A holds in R, then either:
• (a) X is a superkey of R, or
• (b) A is a prime attribute of R
• NOTE: Boyce-Codd normal form disallows
condition (b) above
Boyce-Codd Normal Form
• A relation schema R is in BCNF if, whenever a
nontrivial FD X -> A holds in R, X is a superkey of R.
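The general 3NF definition and the BCNF definition differ only in condition (b), which a short checker makes visible. This is a sketch under assumptions: it inspects only the FDs listed in F, the function names are hypothetical, and the schema R(A, B, C) is a made-up example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attrs, keys = set(attrs), []
    for n in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), n):
            s = set(combo)
            if not any(k <= s for k in keys) and closure(s, fds) == attrs:
                keys.append(s)
    return keys

def violations(attrs, fds):
    """Return (3NF violations, BCNF violations) as lists of (X, A) pairs."""
    prime = set().union(*candidate_keys(attrs, fds))
    not_3nf, not_bcnf = [], []
    for lhs, rhs in fds:
        lhs = set(lhs)
        if closure(lhs, fds) >= set(attrs):
            continue                      # (a) X is a superkey
        for a in sorted(set(rhs) - lhs):  # ignore trivial parts
            not_bcnf.append((lhs, a))
            if a not in prime:            # (b) fails too, so 3NF is violated
                not_3nf.append((lhs, a))
    return not_3nf, not_bcnf

# Hypothetical R(A, B, C) with AB -> C and C -> B: 3NF but not BCNF
r_3nf, r_bcnf = violations({"A", "B", "C"},
                           [({"A", "B"}, {"C"}), ({"C"}, {"B"})])

# EMP_DEPT fragment: SSN -> DNUMBER -> DMGRSSN violates 3NF
e_3nf, e_bcnf = violations({"SSN", "ENAME", "DNUMBER", "DMGRSSN"},
                           [({"SSN"}, {"ENAME", "DNUMBER"}),
                            ({"DNUMBER"}, {"DMGRSSN"})])
```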
Multivalued Dependencies and Fourth Normal Form
Definition:
• A multivalued dependency (MVD) X ->> Y specified on relation schema
R, where X and Y are both subsets of R, specifies the following constraint
on any relation state r of R: If two tuples t1 and t2 exist in r such that t1[X]
= t2[X], then two tuples t3 and t4 should also exist in r with the following
properties, where we use Z to denote (R − (X ∪ Y)):
– t3[X] = t4[X] = t1[X] = t2[X].
– t3[Y] = t1[Y] and t4[Y] = t2[Y].
– t3[Z] = t2[Z] and t4[Z] = t1[Z].
• An MVD X ->> Y in R is called a trivial MVD if (a) Y is a subset of X, or (b)
X ∪ Y = R.
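The MVD definition translates almost directly into a check on a relation state: for every ordered pair (t1, t2) that agrees on X, the tuple t3 built from t1[X], t1[Y] and t2[Z] must be present (t4 is covered when the pair is visited in the other order). The function name mvd_holds and the sample EMP state are assumptions for illustration.

```python
def mvd_holds(rows, attrs, X, Y):
    """Check X ->> Y on a relation state given as a list of dicts."""
    X = list(X)
    Y = [a for a in Y if a not in X]
    Z = [a for a in attrs if a not in X and a not in Y]
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in X):
                # t3 takes X and Y from t1 and Z from t2
                t3 = {**{a: t1[a] for a in X + Y}, **{a: t2[a] for a in Z}}
                if tuple(t3[a] for a in attrs) not in present:
                    return False
    return True

ATTRS = ["ENAME", "PNAME", "DNAME"]
# Hypothetical EMP state: every project of Smith paired with every dependent
emp = [
    {"ENAME": "Smith", "PNAME": "X", "DNAME": "John"},
    {"ENAME": "Smith", "PNAME": "Y", "DNAME": "Anna"},
    {"ENAME": "Smith", "PNAME": "X", "DNAME": "Anna"},
    {"ENAME": "Smith", "PNAME": "Y", "DNAME": "John"},
]
full_state = mvd_holds(emp, ATTRS, ["ENAME"], ["PNAME"])
missing_one = mvd_holds(emp[:3], ATTRS, ["ENAME"], ["PNAME"])  # (Y, John) gone
```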
Multivalued Dependencies and Fourth Normal Form
Definition:
• A relation schema R is in 4NF with respect to a set of
dependencies F (that includes functional dependencies and
multivalued dependencies) if, for every nontrivial
multivalued dependency X ->> Y in F+, X is a superkey for R.
– Note: F+ is the (complete) set of all dependencies (functional
or multivalued) that will hold in every relation state r of R that
satisfies F. It is also called the closure of F.
Multivalued Dependencies and Fourth Normal Form
(a) The EMP relation with two MVDs: ENAME ->> PNAME and
ENAME ->> DNAME.
(b) Decomposing the EMP relation into two 4NF relations
EMP_PROJECTS and EMP_DEPENDENTS.
4NF
• A relation is in 4NF if it is in BCNF and contains
no nontrivial multivalued dependency whose left side
is not a superkey
• That is, for every MVD X ->> Y, either
– the MVD is trivial, or
– X is a superkey
Multivalued Dependencies and Fourth Normal Form

Decomposing a relation state of EMP that is not in 4NF:
(a) EMP relation with additional tuples.
(b) Two corresponding 4NF relations EMP_PROJECTS and
EMP_DEPENDENTS.
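The decomposition can be imitated with plain sets of tuples. The EMP state below is hypothetical; the point of the sketch is that the two projections join back to exactly the original relation, and that a new project costs one row after decomposition instead of one row per dependent.

```python
# Hypothetical EMP state: (ENAME, PNAME, DNAME)
emp = {
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
}

# Project onto the two 4NF relations
emp_projects = {(e, p) for (e, p, _) in emp}
emp_dependents = {(e, d) for (e, _, d) in emp}

# Natural join on ENAME reconstructs EMP (lossless decomposition)
rejoined = {(e, p, d)
            for (e, p) in emp_projects
            for (e2, d) in emp_dependents
            if e == e2}

# Adding project Z for Smith:
new_rows_in_emp = {("Smith", "Z", d)
                   for (e, d) in emp_dependents if e == "Smith"}
new_rows_in_emp_projects = {("Smith", "Z")}
```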
5NF
• A relation R is in 5NF if it is in 4NF and, for every join
dependency JD(R1, R2, ..., Rn) that holds on R, either
– the JD is trivial, or
– every Ri is a superkey of R
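A join dependency JD(R1, ..., Rn) holds on a relation state when joining the projections onto R1, ..., Rn gives back exactly that state. The sketch below checks this on a hypothetical supplier/part/project relation; the attribute names S, P, J, the data and the helper names are assumptions.

```python
from functools import reduce

def relation(rows):
    """Build a relation as a set of frozensets of (attr, value) pairs."""
    return {frozenset(r.items()) for r in rows}

def project(rel, attrs):
    """Projection onto attrs."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

def satisfies_jd(rel, components):
    """True if rel equals the join of its projections onto the components."""
    return reduce(natural_join, [project(rel, c) for c in components]) == rel

three = relation([
    {"S": "s1", "P": "p1", "J": "j2"},
    {"S": "s1", "P": "p2", "J": "j1"},
    {"S": "s2", "P": "p1", "J": "j1"},
])
four = three | relation([{"S": "s1", "P": "p1", "J": "j1"}])

SP, SJ, PJ = {"S", "P"}, {"S", "J"}, {"P", "J"}
jd_on_three = satisfies_jd(three, [SP, SJ, PJ])   # (s1, p1, j1) is missing
jd_on_four = satisfies_jd(four, [SP, SJ, PJ])
two_way = satisfies_jd(four, [SP, SJ])            # binary split is lossy
```

On this data the four-tuple state can be rebuilt only from all three projections together, which is the situation 5NF is concerned with.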
Fifth Normal Form
Class work