CH 11
top-down design: designing a conceptual schema in a high-level data model (ER model)
and mapping to a set of relations
Strict decomposition: start with one giant relation schema, the universal relation R, and
decompose it into D = {R1, R2, ..., Rm} so that:
1) each relation Ri in D is in BCNF or 3NF (not good enough by itself for a good design)
2) attribute preservation: make sure no attributes are lost in the decomposition, i.e.
R = union of all Ri
- more formally:
projection of F on Ri:
ΠF(Ri) = {X -> Y in F+ | (all Aj in X U Y) contained in Ri}
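The projection of F on Ri can be computed by brute force: take every nonempty subset X of Ri, compute its attribute closure under F, and keep whatever part of the closure falls inside Ri. The sketch below does this in Python. The function names and the TOY attributes (NAME, MAN_NAME) are assumptions made for illustration, and the approach is exponential in the size of Ri, so it only suits small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project_fds(fds, ri):
    """Nontrivial FDs X -> Y in F+ whose attributes all lie in ri.

    Returns a dict mapping each X to the largest such Y (X itself removed).
    """
    ri = frozenset(ri)
    projected = {}
    for size in range(1, len(ri) + 1):
        for lhs in combinations(sorted(ri), size):
            lhs = frozenset(lhs)
            rhs = (closure(lhs, fds) & ri) - lhs
            if rhs:
                projected[lhs] = rhs
    return projected

# Hypothetical schema TOY(TOY_NUM, NAME, MAN_ID, MAN_NAME) with
# F = {TOY_NUM -> NAME MAN_ID, MAN_ID -> MAN_NAME}
F = [
    (frozenset({"TOY_NUM"}), frozenset({"NAME", "MAN_ID"})),
    (frozenset({"MAN_ID"}), frozenset({"MAN_NAME"})),
]
R1 = {"TOY_NUM", "MAN_NAME"}
proj = project_fds(F, R1)
print(proj)
```

Here the projection on R1 contains TOY_NUM -> MAN_NAME, a dependency that is in F+ but not in F itself, which is why the definition uses F+ rather than F.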
- decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join
property with respect to F on R if for every relation state r of R that
satisfies F we have:
*(ΠR1(r), ΠR2(r), ..., ΠRm(r)) = r
where * is the natural join of all the projections
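One way to see the property on a concrete state is to project r onto each Ri, natural-join the projections, and compare the result with r. The sketch below does that in Python for a hypothetical relation over TOY_NUM, MAN_ID and MAN_NAME in which MAN_ID -> MAN_NAME holds. The helper names and the sample data are assumptions. Checking one state can expose a lossy decomposition, but it cannot prove the property, which must hold for every state.

```python
def relation(rows):
    """Build a relation (set of tuples) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}

def project(rel, attrs):
    """Projection onto attrs; duplicates disappear because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

def join_of_projections(rel, decomposition):
    """Project rel onto each Ri, then natural-join all the projections."""
    parts = [project(rel, ri) for ri in decomposition]
    joined = parts[0]
    for p in parts[1:]:
        joined = natural_join(joined, p)
    return joined

# Hypothetical state; MAN_ID -> MAN_NAME holds in it.
r = relation([
    {"TOY_NUM": "T1", "MAN_ID": "M1", "MAN_NAME": "Acme"},
    {"TOY_NUM": "T2", "MAN_ID": "M1", "MAN_NAME": "Acme"},
    {"TOY_NUM": "T3", "MAN_ID": "M2", "MAN_NAME": "Acme"},
])

good = [{"TOY_NUM", "MAN_ID"}, {"MAN_ID", "MAN_NAME"}]
bad = [{"TOY_NUM", "MAN_NAME"}, {"MAN_ID", "MAN_NAME"}]

print(join_of_projections(r, good) == r)   # join gives back exactly r
print(len(join_of_projections(r, bad)))    # extra (spurious) tuples appear
```

The second decomposition joins on MAN_NAME, which does not determine anything, so the join produces tuples that were never in r. That is the "additive" behaviour the nonadditive join property rules out.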
- Cannot guarantee all goals, do the best we can and deal with resulting
anomalies when they arise
ex: if there is a toy in the TOY relation with a null MAN_ID, a query
involving a join of the TOY with the MANUFACTURER relation would
leave out that TOY
- an outer join could solve this problem - tuples with null values on join
attributes still appear - but may give you more info than you want
- nulls can also cause problems with aggregate functions - how to interpret them
- Be cautious when assigning null to an attribute - especially foreign keys
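The effect of nulls on joins and aggregates can be reproduced with a small in-memory SQLite database. This is a sketch under assumptions: the TOY and MANUFACTURER relations and MAN_ID come from the notes, while PRICE, MAN_NAME and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MANUFACTURER (MAN_ID INTEGER PRIMARY KEY, MAN_NAME TEXT);
    CREATE TABLE TOY (TOY_NUM INTEGER PRIMARY KEY,
                      PRICE REAL,
                      MAN_ID INTEGER REFERENCES MANUFACTURER(MAN_ID));
    INSERT INTO MANUFACTURER VALUES (1, 'Acme');
    INSERT INTO TOY VALUES (10, 5.0, 1);
    INSERT INTO TOY VALUES (11, NULL, 1);   -- price unknown
    INSERT INTO TOY VALUES (12, 7.0, NULL); -- manufacturer unknown
""")

# Inner join: toy 12 is left out because NULL never equals any MAN_ID.
inner = conn.execute(
    "SELECT TOY_NUM FROM TOY JOIN MANUFACTURER USING (MAN_ID) "
    "ORDER BY TOY_NUM"
).fetchall()

# Left outer join: toy 12 is kept, padded with NULL for MAN_NAME.
outer = conn.execute(
    "SELECT TOY_NUM, MAN_NAME FROM TOY "
    "LEFT OUTER JOIN MANUFACTURER USING (MAN_ID) ORDER BY TOY_NUM"
).fetchall()

# Aggregates: COUNT(*) counts rows, COUNT(PRICE) and AVG(PRICE) skip NULLs.
count_all, count_price, avg_price = conn.execute(
    "SELECT COUNT(*), COUNT(PRICE), AVG(PRICE) FROM TOY"
).fetchone()

print(inner)
print(outer)
print(count_all, count_price, avg_price)
conn.close()
```

Note that the average is taken over the two known prices only, so whether it answers the question being asked depends on what a null PRICE is supposed to mean.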
Dangling Tuples: Assume that some entity is represented in more than one relation (may
happen if relations are fragmented on distributed dbs) - if a tuple exists
in one but not another, it is called a dangling tuple
- this could happen if we choose another alternative to using null values - leaving
out the tuple
TOY_2(TOY_NUM, MAN_ID)
- the first relation keeps all toy info, the second keeps pointer to
manufacturer relation
- if we have toys with no man_id as in above example, they may be left
out of TOY_2, and exist in TOY_1
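Dangling tuples can be found by comparing key values across the two relations. In the sketch below only TOY_2(TOY_NUM, MAN_ID) comes from the notes. The attributes of TOY_1 (here TOY_NUM and NAME), the sample rows, and the helper name are assumptions.

```python
# Hypothetical states. Toy 12 has no manufacturer, so it was left out of TOY_2.
toy_1 = {(10, "robot"), (11, "kite"), (12, "puzzle")}  # (TOY_NUM, NAME)
toy_2 = {(10, "M1"), (11, "M1")}                       # (TOY_NUM, MAN_ID)

def dangling(left, right):
    """Tuples of left whose key (first column) has no match in right."""
    right_keys = {t[0] for t in right}
    return {t for t in left if t[0] not in right_keys}

# Natural join on TOY_NUM: dangling tuples silently drop out of the result.
joined = {(num, name, man)
          for (num, name) in toy_1
          for (num2, man) in toy_2
          if num == num2}

print(dangling(toy_1, toy_2))
print(sorted(joined))
```

Any query that reaches toy information through the join sees two toys, although TOY_1 holds three.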
Discussion:
- algorithms are not deterministic: ex: a minimal cover is not necessarily unique -
so there may be more than one result depending on which minimal cover
is chosen
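A small example makes the non-uniqueness concrete. The sketch below uses abstract attributes A, B, C (an assumption, not from the notes) with F = {A -> B, B -> A, A -> C, B -> C}, and checks that two different sets of dependencies are both minimal and both equivalent to F. The helper names are hypothetical, and is_minimal assumes a single attribute on each right-hand side.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

def is_minimal(f):
    """No FD is redundant and no left-hand-side attribute is extraneous."""
    for i, (lhs, rhs) in enumerate(f):
        rest = f[:i] + f[i + 1:]
        if set(rhs) <= closure(lhs, rest):
            return False  # this FD follows from the others
        for a in lhs:
            reduced = [x for x in lhs if x != a]
            if reduced and set(rhs) <= closure(reduced, f):
                return False  # attribute a is not needed on the left
    return True

A, B, C = ("A",), ("B",), ("C",)
F = [(A, B), (B, A), (A, C), (B, C)]
G1 = [(A, B), (B, A), (A, C)]   # drops B -> C
G2 = [(A, B), (B, A), (B, C)]   # drops A -> C

print(equivalent(F, G1), is_minimal(G1))
print(equivalent(F, G2), is_minimal(G2))
print(is_minimal(F))
```

Both G1 and G2 qualify as minimal covers of F, so a synthesis algorithm that starts from a minimal cover can produce different relation schemas depending on which one it happens to compute.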
multivalued dependencies:
- whenever two independent 1:N relationships A:B, and A:C are mixed in the
same relation, an MVD may arise
Ex: Assume that the TOY relation records, for each toy, the warehouses
in which it is stored (each with a code name W1, W2,...), and that the
same toy may go by more than one name:
Algorithm 13.5 produces 4NF relations with the lossless join property
- does not necessarily preserve functional dependencies
- always replaces a non-4NF relation schema with 2 new ones and
iterates
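A single decomposition step of this kind can be sketched on the toy/warehouse/name example. The code below assumes a hypothetical relation TOY(TOY_NUM, WAREHOUSE, NAME) in which TOY_NUM ->> WAREHOUSE holds (and therefore TOY_NUM ->> NAME). It checks the MVD on a sample state, splits the relation in two, and confirms that the join gives back the original tuples. The attribute names, data, and function name are assumptions, and this illustrates one step, not the full algorithm.

```python
from itertools import product

# Hypothetical state of TOY(TOY_NUM, WAREHOUSE, NAME).
toy = {
    (1, "W1", "Robo"), (1, "W1", "Robot"),
    (1, "W2", "Robo"), (1, "W2", "Robot"),
    (2, "W1", "Kite"),
}

def satisfies_mvd(rel):
    """Check TOY_NUM ->> WAREHOUSE on rel.

    For each toy, every warehouse must appear with every name.
    """
    for num in {t[0] for t in rel}:
        rows = {t for t in rel if t[0] == num}
        warehouses = {w for _, w, _ in rows}
        names = {nm for _, _, nm in rows}
        if rows != {(num, w, nm) for w, nm in product(warehouses, names)}:
            return False
    return True

def decompose(rel):
    """Split on the MVD into (TOY_NUM, WAREHOUSE) and (TOY_NUM, NAME)."""
    return {(n, w) for n, w, _ in rel}, {(n, nm) for n, _, nm in rel}

def rejoin(toy_wh, toy_name):
    """Natural join of the two parts on TOY_NUM."""
    return {(n, w, nm) for n, w in toy_wh for n2, nm in toy_name if n == n2}

toy_wh, toy_name = decompose(toy)
print(satisfies_mvd(toy))
print(rejoin(toy_wh, toy_name) == toy)

# Removing one combination breaks the MVD, and the same split becomes lossy.
broken = toy - {(1, "W2", "Robot")}
print(satisfies_mvd(broken))
print(rejoin(*decompose(broken)) == broken)
```

The split is lossless only because the MVD holds. Once a warehouse/name combination is missing, the join reintroduces it as a spurious tuple.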
5NF - permits lossless decomposition into more than two relations (see book for more if
interested)
- Get an idea of what a good design is by examining the inconsistencies that could arise under
normal operations (insert, delete, modify)
- Decomposition of relations
- Can cause extreme decomposition - have to use too many joins. What about more
physical relationships between tables? More on this later.