C20.0046: Database Management Systems Lecture #5: M.P. Johnson Stern School of Business, NYU Spring, 2008
Notation: A1, A2, …, An → B1, B2, …, Bm
If all these FDs are true:
name → color
category → department
color, category → price
Then this FD also holds: name, category → price
Why?
Start with X = {A1, …, An}.
Repeat:
    if B1, …, Bn → C is an FD and B1, …, Bn are all in X,
    then add C to X,
until X didn’t change.

Example:
name → color
category → department
color, category → price
{name, category}+ = {name, category, color, department, price}
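The repeat-until loop above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `closure` and the representation of each FD as a pair of sets are assumptions made for the example.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    (This representation is an assumption made for the sketch.)
    """
    x = set(attrs)                      # start with X = {A1, ..., An}
    changed = True
    while changed:                      # repeat ... until X didn't change
        changed = False
        for lhs, rhs in fds:
            # If every attribute on the left is already in X, add the right side.
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x


fds = [
    ({"name"}, {"color"}),
    ({"category"}, {"department"}),
    ({"color", "category"}, {"price"}),
]

print(sorted(closure({"name", "category"}, fds)))
# ['category', 'color', 'department', 'name', 'price']
```

Because price ends up in {name, category}+, the FD name, category → price follows from the three given FDs.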
Example:
A, B → C
A, D → B
B → D
A+ = A, B+ = BD, C+ = C, D+ = D
AB+ = ABCD, AC+ = AC, AD+ = ABCD, BC+ = BCD, BD+ = BD, CD+ = CD
ABC+ = ABD+ = ACD+ = ABCD (no need to compute: why?)
BCD+ = BCD, ABCD+ = ABCD
What are the keys?
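One way to check an answer to the question above is to enumerate attribute sets by increasing size and keep the minimal ones whose closure is everything. The sketch below assumes the relation is R(A,B,C,D), which the slide implies but does not state; `closure` and `candidate_keys` are hypothetical helper names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs of sets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x


def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure is `all_attrs`."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue                # contains a smaller key: not minimal
            if closure(s, fds) == set(all_attrs):
                keys.append(s)
    return keys


# ASSUMPTION: the relation is R(A, B, C, D).
fds = [(set("AB"), set("C")), (set("AD"), set("B")), (set("B"), set("D"))]
keys = candidate_keys(set("ABCD"), fds)
print(["".join(sorted(k)) for k in keys])   # ['AB', 'AD']
```

The skip on supersets of a known key is the same observation behind "no need to compute": any set containing AB or AD already has closure ABCD.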
M.P. Johnson, DBMS, Stern/NYU, Spring 2008 7
Closure algorithm example
In class:
R(A,B,C,D,E,F)
A, B → C
A, D → E
B → D
A, F → B
FDs are:
Keys are: {name, category}
FDs are:
Keys are:
Decomposition by projection
BCNF
Lossy v. lossless
Update anomalies:
Change info in one tuple but not in another
Deletion anomalies:
Delete some values & lose other values too
Insert anomalies:
Inserting row means having to insert other, separate info /
null-ing it out
SSN → Name, Mailing-address
but not SSN → Phone
Redundancy: name, maddress
Update anomaly: Bill moves
Delete anom.: Bill doesn’t pay bills, lose phones → lose Bill!
Insert anom: can’t insert someone without a (non-null) phone
Underlying cause: SSN-phone is many-many
Effect: partial dependency ssn → name, maddress,
whereas key = {ssn, phone}
Decomposition by projection
Soln: replace anomalous R with projections of R onto
two subsets of attributes
Projection: an operation in Relational Algebra
Corresponds to the SELECT clause (the column list) in SQL
R1(A1, ..., An, B1, ..., Bm)
R2(A1, ..., An, C1, ..., Cp)
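Projection onto a subset of attributes can be sketched as follows. This is an illustration only: the function `project` and the list-of-dicts representation are assumptions, and the sample rows are borrowed from the name/phone example used elsewhere in this lecture.

```python
def project(rows, attrs):
    """Project a relation (a list of dicts) onto `attrs`, dropping duplicates."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:             # relations are sets: keep one copy
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


person = [
    {"name": "Michael", "ssn": 123, "maddress": "NY", "phone": "212-111-1111"},
    {"name": "Michael", "ssn": 123, "maddress": "NY", "phone": "917-111-1111"},
]

# The shared attribute ssn plays the role of A1, ..., An.
r1 = project(person, ["ssn", "name", "maddress"])
r2 = project(person, ["ssn", "phone"])

print(r1)   # one row: the name and address are now stored once
print(r2)   # two rows: one per phone number
```

After the projection, the redundant copy of the name and mailing address is gone, which is the point of decomposing.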
Relational Model:
plus FD’s
Normalization:
Eliminates anomalies
R1(A,B)        R2(A,C)
Recover
WP  Word       WP  100
DB  Oracle     DB  1000
DB  Access     DB  100

X  Y  Z
1  2  3
4  2  5
R1 R2
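A short sketch of why a decomposition can be lossy: joining the two projections back together may produce tuples that were never in the original. The original relation below is an assumption (the slide rows are read as R1(A,B) and R2(A,C), paired in the order shown); only the two projections appear in the slide text.

```python
# Projections as listed on the slide.
r1 = [("WP", "Word"), ("DB", "Oracle"), ("DB", "Access")]   # R1(A, B)
r2 = [("WP", 100), ("DB", 1000), ("DB", 100)]               # R2(A, C)

# ASSUMPTION: the original R(A, B, C) pairs the rows in the order shown.
original = {("WP", "Word", 100), ("DB", "Oracle", 1000), ("DB", "Access", 100)}

# Natural join of R1 and R2 on the shared attribute A.
joined = {(a, b, c) for (a, b) in r1 for (a2, c) in r2 if a == a2}

spurious = joined - original
print(len(joined))        # 5
print(sorted(spurious))   # [('DB', 'Access', 1000), ('DB', 'Oracle', 100)]
```

The join returns every original tuple plus two extra ones, so the original cannot be recovered: the decomposition is lossy. Here A determines neither B nor C, which is what allows the extra combinations.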
Boyce-Codd Normal Form
Name/phone example is not BCNF:
Name     SSN  Mailing-address  Phone
Michael  123  NY               212-111-1111
Michael  123  NY               917-111-1111
{ssn, phone} is key
FD: ssn → name, mailing-address holds
Violates BCNF: ssn is not a superkey
Its decomposition is BCNF
Only superkeys → anything else

Name     SSN  Mailing-address
Michael  123  NY

SSN  PhoneNumber
123  212-111-1111
123  917-111-1111
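The BCNF test for the name/phone example can be sketched with the closure algorithm: an FD violates BCNF when it is nontrivial and its left side is not a superkey. The helper names are hypothetical, and the FDs listed for each decomposed relation are written out by hand as an assumption rather than derived.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs of sets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x


def bcnf_violations(attrs, fds):
    """Return the nontrivial FDs whose left side is not a superkey of `attrs`."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not rhs <= lhs
        if nontrivial and closure(lhs, fds) != set(attrs):
            bad.append((lhs, rhs))
    return bad


person_fds = [({"ssn"}, {"name", "maddress"})]

# Original relation: ssn+ misses phone, so ssn is not a superkey.
print(bcnf_violations({"name", "ssn", "maddress", "phone"}, person_fds))

# Decomposed relations (FDs per relation are an assumption, listed by hand).
print(bcnf_violations({"name", "ssn", "maddress"}, person_fds))   # []
print(bcnf_violations({"ssn", "phone"}, []))                      # []
```

The first call reports ssn → name, maddress as the violating FD; both decomposed relations come back clean.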
Resulting relations:
R1
Theater   N’hood
Angelica  Village

R2
Theater   Title
Angelica  Aviator
Angelica  Life Aquatic
Data-lossy v. FD-lossy
A relation R is in 3rd normal form if:
For every nontrivial dependency A1, A2, ..., An → B for R,
{A1, A2, ..., An} is a super-key for R,
or B is part of a key, i.e., B is prime
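The 3NF condition can be sketched on top of the closure algorithm: find the keys, collect the prime attributes, then test each FD. The helper names are hypothetical, and the sketch checks only the FDs it is given. The name/phone FD comes from this lecture; the theater FDs (theater → nhood; title, nhood → theater) are an assumption, since the slide text shows only the theater tables.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs of sets."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is `all_attrs`."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            s = set(combo)
            if not any(k <= s for k in keys) and closure(s, fds) == set(all_attrs):
                keys.append(s)
    return keys


def violations_3nf(attrs, fds):
    """Return (lhs, b) pairs where lhs is not a superkey and b is not prime."""
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        for b in rhs - lhs:             # nontrivial part, one attribute at a time
            if closure(lhs, fds) != set(attrs) and b not in prime:
                bad.append((lhs, b))
    return bad


# From the lecture: key is {ssn, phone}; name and maddress are not prime.
person = {"name", "ssn", "maddress", "phone"}
person_fds = [({"ssn"}, {"name", "maddress"})]
print(violations_3nf(person, person_fds))      # two violations

# ASSUMPTION: FDs for the theater example are not given in the slide text.
theater = {"theater", "nhood", "title"}
theater_fds = [({"theater"}, {"nhood"}), ({"title", "nhood"}, {"theater"})]
print(violations_3nf(theater, theater_fds))    # []
```

Under the assumed theater FDs the relation passes 3NF because nhood is prime, even though theater is not a superkey, so it would still fail the BCNF test.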
Tradeoff:
BCNF = no FD anomalies, but may lose some FDs
3NF = keeps all FDs, but may have some anomalies