Normalization

The document discusses the normalization of relational schemas, highlighting the issues of redundancy and anomalies that arise from improper design. It explains concepts such as functional dependencies, keys, and various normal forms (1NF, 2NF, 3NF, BCNF) that help reduce redundancy in databases. The text also covers Armstrong's axioms and the closure of attribute sets and functional dependencies to aid in understanding these normalization processes.

Normalization

Logical Design of Relational Schemas

Erős Levente, 2024.


What is the problem with this relation?
productID  supplierName     supplierAddr          price
1          Erik és Fia Bt.  Budapest, Fő utca 1   1000 HUF
2          Erik és Fia Bt.  Budapest, Fő utca 1   3000 HUF
3          NASA             300 E. Street SW WDC  90000000 HUF
4          Erik és Fia Bt.  Budapest, Fő utca 1   2500 HUF
5          Erik és Fia Bt.  Budapest, Fő utca 1   5000 HUF
6          NASA             300 E. Street SW WDC  180000000 HUF
7          NASA             300 E. Street SW WDC  70000000 HUF
8          NASA             300 E. Street SW WDC  60000000 HUF
9          NASA             300 E. Street SW WDC  7770000000 HUF
Introduction
• another way for creating relational schemas
• which was the first? ER → relational mapping
• there might be unnecessary data
• When is data unnecessary? Is supplier address (supplierAddr) unnecessary?
• Product(productID, supplierName, supplierAddr, price)
• supplierAddr is REDUNDANT (it can be figured out from supplierName)
• Product(productID, supplierName, price)
Supplier(supplierName, supplierAddr)
• supplierAddr is NOT REDUNDANT
• see the next slide for an example of redundancy
Examples
productID  supplierName     supplierAddr         price
1          Erik és Fia Bt.  Budapest, Fő utca 1  1000 HUF
2          Erik és Fia Bt.  Budapest, Fő utca 1  3000 HUF

productID  supplierName     price
1          Erik és Fia Bt.  1000 HUF
2          Erik és Fia Bt.  3000 HUF

supplierName     supplierAddr
Erik és Fia Bt.  Budapest, Fő utca 1
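The split shown in the tables above can be sketched in a few lines of Python. This is only an illustration: the variable names are made up, prices are stored as plain numbers, and the join is a naive nested loop.

```python
# Rows of Product(productID, supplierName, supplierAddr, price) from the example.
product = [
    (1, "Erik és Fia Bt.", "Budapest, Fő utca 1", 1000),
    (2, "Erik és Fia Bt.", "Budapest, Fő utca 1", 3000),
]

# Project onto the two smaller schemas; a set keeps each distinct row once.
product_slim = {(pid, name, price) for pid, name, _addr, price in product}
supplier = {(name, addr) for _pid, name, addr, _price in product}

# Natural join on supplierName rebuilds the original rows.
rejoined = {
    (pid, name, addr, price)
    for pid, name, price in product_slim
    for sname, addr in supplier
    if name == sname
}

print(len(supplier))             # number of stored supplier rows
print(rejoined == set(product))  # does the join give back the original relation?
```

After the split the address is stored once per supplier, and joining the two relations on supplierName gives back the original rows.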
Why is redundancy a problem?
• It causes anomalies
• Insertion anomaly
• can’t insert a piece of data without other pieces of data
• can’t insert supplier (name+address) without at least one product
• Deletion anomaly
• by deleting some data, other data is deleted too
• if you delete the last product of the supplier, its address disappears too
• Update anomaly
• changing the value of an attribute in one row means changing the same attribute in other rows
• supplier changes address → all its supplierAddr values have to be changed
• resolved by decomposition (not discussed)
Functional dependencies (FDs)
• Example:
• supplier name defines supplier address ==
• supplier address is functionally dependent on supplier name ==
• if supplier name is repeated in a relation, supplier address is repeated too
• Definition:
• Given a relational schema R that includes attribute sets X and Y, Y is functionally dependent on X, i.e. X → Y,
• if for any relation r(R)
• and for an arbitrary pair of tuples t and t' of r,
• whenever the values of X in t equal the values of X in t', i.e. t[X]==t'[X], the values of Y are equal too, i.e. t[Y]==t'[Y]
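The pairwise condition in the definition can be tried out on a concrete table. The sketch below is a minimal illustration; the function name fd_holds and the dict-per-row representation are assumptions, not something the slides prescribe.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by this one relation instance.

    rows: list of dicts (attribute name -> value); lhs, rhs: lists of names.
    """
    for t, u in combinations(rows, 2):
        same_lhs = all(t[a] == u[a] for a in lhs)
        same_rhs = all(t[a] == u[a] for a in rhs)
        if same_lhs and not same_rhs:
            return False  # t[X] == t'[X] but t[Y] != t'[Y]
    return True

rows = [
    {"productID": 1, "supplierName": "Erik és Fia Bt.",
     "supplierAddr": "Budapest, Fő utca 1", "price": 1000},
    {"productID": 2, "supplierName": "Erik és Fia Bt.",
     "supplierAddr": "Budapest, Fő utca 1", "price": 3000},
    {"productID": 3, "supplierName": "NASA",
     "supplierAddr": "300 E. Street SW WDC", "price": 90000000},
]

print(fd_holds(rows, ["supplierName"], ["supplierAddr"]))  # True
print(fd_holds(rows, ["supplierName"], ["price"]))         # False
```

Note that such a check only looks at one relation r(R). The definition requires the condition for every possible relation over R, so a passing check can show that an FD is not violated by the data, but it cannot prove that the FD belongs to the schema.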
Functional dependencies (FDs)
• Remarks
• X → Y can hold even if X has no repeating values.
• coincidental dependencies that merely happen to hold in one table (each Bob in the table has brown hair) are not real dependencies; real ones follow from the meaning of the data (each dentist has a university diploma)
• FD X → Y only causes redundancy if X can be repeated, otherwise not
Some definitions
• If X,Y ⊆ R and X → Y, but ∄X′ ⊂ X so that X′ → Y, then X is the determinant of Y
• minimal (not minimum) set of attributes that define Y
• If X,Y ⊆ R and X → Y, but ∄X′ ⊂ X so that X′ → Y, then Y is fully dependent on X (full dependency)
• the left side of the FD cannot be further reduced
• If X,Y ⊆ R and X → Y, and ∃X′ ⊂ X so that X′ → Y, then Y is partially dependent on X (partial dependency)
• the left side of the FD can be reduced and will still define the right side
Keys
• X is a superkey iff X defines the schema, i.e. X → R
• X is a key if it is a superkey and none of its proper subsets are superkeys
• X is a key if the schema is fully dependent on it
• one key is the primary key
• the rest of the keys are candidate keys
• this distinction has no mathematical relevance
• A is a primary (prime) attribute if it is included in at least one of the keys
• A is a secondary attribute if it is included in none of the keys
True and deducible FDs
• The functional dependency set F includes all the dependencies of a schema R
• A FD X → Y is true over a functional dependency set F if it stands on all those relations r(R) on which the dependencies of F stand. Notation: F ⊨ X → Y
• A FD X → Y is deducible from a functional dependency set F if it can be obtained by repeatedly applying Armstrong's axioms on the FDs in F. Notation: F ⊢ X → Y
• Theorem: A FD is true over F iff it is deducible from F.
Armstrong’s axioms
• X ⊆ Y ⊨ Y → X, trivial dependency (reflexivity)
• X → Y ∧ Y → Z ⊨ X → Z, transitivity
• this has (almost) nothing to do with transitive dependency
• X → Y ⊨ XZ → YZ, expandability (augmentation)
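To see how the axioms are chained, here is a short derivation of the union rule (from X → Y and X → Z it follows that X → YZ). The rule and the step numbering are added for illustration only.

```latex
\begin{align*}
&\text{Given: } X \to Y,\quad X \to Z \\
&1.\ X \to XY   && \text{expandability applied to } X \to Y \text{ with } X \ (XX = X) \\
&2.\ XY \to YZ  && \text{expandability applied to } X \to Z \text{ with } Y \\
&3.\ X \to YZ   && \text{transitivity applied to 1 and 2}
\end{align*}
```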
Normal forms
• Goal: reducing redundancy
• Normal forms are defined
• each excludes certain types of redundancy
• If a relational schema R is in a given normal form,
then an arbitrary relation r(R) is guaranteed to exclude
the given types of redundancy
First normal form, 1NF
[Figure: normal form diagram with the labels 0NF, 1NF, 2NF, 3NF, BCNF]
• Definition: Each attribute is atomic
• 0NF includes non-atomic attributes
• Question: When is an attribute atomic?
• if we don't need parts of attribute values
• depends on application, e.g. date (YYYY-MM-DD)
• is it atomic or not?
• Further normal forms will assume that the 1NF condition is met.
Second normal form, 2NF
• Definition: 1NF and no secondary attribute depends on a proper subset of a key
• Equivalent definition: Each secondary attribute is fully dependent on each key
• Counter example:
R(A,B,C)
F={B → C}
key: ?
• Why is this redundant?
Finding keys
• Classify attributes based on the FDs in the FD set:
• appear only on left hand side: included in each key (nobody defines them)
• appear only on right hand side: included in none of the keys (defined by others)
• appear on neither side: included in each key (are only defined by themselves)
• appear on both sides: the magic happens here
• observations & brute force → find keys
• Example: R(A,B,C), F={B → C}
left  right  both  none
B     C      -     A
• A and B are in all keys. Let's check what AB defines: AB → ABC (itself plus C), so AB is a key
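The classification plus brute-force search can be sketched as follows. The function names closure and find_keys, and the representation of an FD as a (left side, right side) pair of sets, are illustrative choices rather than anything the slides define.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes defined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_keys(schema, fds):
    """Return all keys of the schema as a list of frozensets."""
    left = set().union(*(lhs for lhs, _ in fds))
    right = set().union(*(rhs for _, rhs in fds))
    core = schema - right   # only-left and neither-side attributes: in every key
    maybe = left & right    # both-side attributes: tried by brute force
    keys = []
    for size in range(len(maybe) + 1):
        for extra in combinations(sorted(maybe), size):
            cand = frozenset(core | set(extra))
            if any(k <= cand for k in keys):
                continue    # contains a smaller key, so it is not minimal
            if closure(cand, fds) == schema:
                keys.append(cand)
    return keys

# R(A,B,C), F={B -> C}
print(find_keys({"A", "B", "C"}, [({"B"}, {"C"})]))
```

Attributes that appear only on the right hand side are never added to a candidate, and candidates are tried in order of growing size, so every set that is kept is a minimal superkey. For the example the only key found is AB.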
Second normal form, 2NF
• Definition: 1NF and no secondary attribute depends on a proper subset of a key
• Equivalent definition: Each secondary attribute is fully dependent on each key
• Counter example:
R(A,B,C)
F={B → C}
key: AB
A      B    C
Jake   Bp   U
Alice  Bp   U
Bob    Szl  R
• Why is this redundant?
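A possible way to look for 2NF violations mechanically is sketched below. It assumes the keys are already known and passed in; the names closure and partial_dependencies are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes defined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(keys, fds):
    """List (key subset, secondary attribute) pairs that violate 2NF.

    keys: all keys of the schema, each a set of attribute names.
    """
    prime = set().union(*keys)
    violations = []
    for key in keys:
        # every non-empty proper subset of the key
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                for attr in sorted(closure(subset, fds) - prime):
                    violations.append((set(subset), attr))
    return violations

# R(A,B,C), F={B -> C}, key: AB
print(partial_dependencies([{"A", "B"}], [({"B"}, {"C"})]))
```

For the counter example this reports that B alone already defines the secondary attribute C. A schema whose keys are all single attributes yields an empty list, because such a key has no non-empty proper subset.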
Transitive dependencies
• NOT a single dependency but a dependency system
• NOT to be confused with transitivity
• the cause of redundancy (as far as we're concerned in this course)
• A is transitively dependent on X if there is a Y for which
• X → Y
• Y ⇸ X (Y doesn't define X)
• Y → A
• A ∉ Y
• Homework: Think of how this causes redundancy
Third normal form, 3NF
• Definitions
• 1NF and for each X → A, X is a superkey or A is a prime attribute
• 1NF and no secondary attribute depends on any key transitively
• Counter example:
• R(A,B,C)
• F={A → B, B → C}
• key: A, all other attributes: secondary, and C transitively depends on A
A  B  C
1  y  a
2  y  a
3  x  b
• Question: What's the highest normal form of R?
• Example for 3NF:
• R(S(treet), C(ity), Z(IP code)) = R(S,C,Z)
• F={SC → Z, Z → C}
S                 C            Z
Józsefhegyi utca  Budapest     1025
Józsefhegyi utca  Törökbálint  2045
Törökvész út      Budapest     1025
• Question: Leftover redundancy?
Boyce-Codd normal form, BCNF
• Definitions
• 1NF and for each X → A, X is a superkey
• 1NF and no attribute depends on any key transitively
• Example for BCNF (the 3NF example R(S,C,Z) decomposed):
• R1(S,Z); R2(Z,C)
• F1={}; F2={Z → C}
S                 Z
Józsefhegyi utca  1025
Józsefhegyi utca  2045
Törökvész út      1025

C            Z
Budapest     1025
Törökbálint  2045
• Question: Problems?
• Btw. any 2-attribute schema is in BCNF
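The two "for each X → A" definitions of 3NF and BCNF translate almost directly into a check over the FD set. The sketch below assumes that the keys of the schema are already known and that it is enough to test the FDs listed in F; the function names are made up for illustration.

```python
def closure(attrs, fds):
    """All attributes defined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def nf_violations(schema, fds, keys):
    """Return (bcnf_violations, third_nf_violations), each a list of (lhs, attr)."""
    prime = set().union(*keys)
    bcnf, third = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) == schema:
            continue                    # lhs is a superkey: fine for both forms
        for attr in sorted(rhs - lhs):  # skip the trivial part of the FD
            bcnf.append((lhs, attr))
            if attr not in prime:       # 3NF tolerates prime attributes here
                third.append((lhs, attr))
    return bcnf, third

# R(A,B,C), F={A -> B, B -> C}, key A
print(nf_violations(set("ABC"), [({"A"}, {"B"}), ({"B"}, {"C"})], [{"A"}]))
# R(S,C,Z), F={SC -> Z, Z -> C}, keys SC and SZ
print(nf_violations(set("SCZ"), [({"S", "C"}, {"Z"}), ({"Z"}, {"C"})],
                    [{"S", "C"}, {"S", "Z"}]))
# R2(Z,C), F2={Z -> C}, key Z
print(nf_violations(set("ZC"), [({"Z"}, {"C"})], [{"Z"}]))
```

In the first schema B → C breaks both normal forms. In R(S,C,Z) the FD Z → C breaks only BCNF, because C is a prime attribute. R2(Z,C) has no violation at all.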
Closures
• The closure of attribute set X is the largest attribute set W for which X → W stands. Formally: X+(F) = {A | A ∈ R and F ⊨ X → A}
• calculated in linear time:
• X+ = X initially
• in each step
• find a FD whose left hand side is in X+, while its right hand side is not
• add the right hand side to X+
• stop when no such FD is left
• The closure of a functional dependency set F is the set of all FDs which can be deduced from F using Armstrong's axioms. Formally: F+ = {X → Y | F ⊨ X → Y}
• calculated in exponential time
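The attribute closure steps can be written down as a short loop. This is a simple sketch, not the linear-time variant: it rescans the whole FD list after every change. The names attribute_closure and implies are illustrative.

```python
def attribute_closure(attrs, fds):
    """Compute X+ for the attribute set attrs under fds (list of (lhs, rhs) set pairs)."""
    closure = set(attrs)          # X+ = X initially
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # left hand side already in X+, right hand side not (fully) in it yet
            if lhs <= closure and not rhs <= closure:
                closure |= rhs    # add the right hand side to X+
                changed = True
    return closure

def implies(fds, lhs, rhs):
    """Decide whether lhs -> rhs is in F+ without enumerating F+."""
    return set(rhs) <= attribute_closure(lhs, fds)

fds = [({"A"}, {"B"}), ({"B"}, {"C"})]   # F={A -> B, B -> C}
print(sorted(attribute_closure({"A"}, fds)))  # ['A', 'B', 'C']
print(implies(fds, {"A"}, {"C"}))             # True
print(implies(fds, {"C"}, {"A"}))             # False
```

Testing membership through the closure of the left hand side avoids building F+, which grows exponentially with the number of attributes.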
Closures, examples
• F={A → B, B → C}
• F+={A → B, B → C, A → A, B → B, C → C, AB → AB, AC → AC, BC → BC, ABC → ABC, AB → A, AB → B, AC → A, AC → C, BC → B, BC → C, ABC → A, ABC → B, ABC → C, A → C, A → BC, AC → B, AB → AC, ...}
• A+(F)=?
• A+(F)_0 = A
• A+(F)_1 = AB (based on A → B)
• A+(F)_2 = ABC (based on B → C)
• A+(F)_3 = A+(F)_2, no further expansion is possible, we're done.
• Question: How to determine whether A → C is in F+ or not?
• Solution I: Find F+ and see whether A → C is in it: takes a long time
• Solution II: Calculate A+(F) and see whether C is in it: efficient