UNIT III Redundancy
Schema Refinement
Problems caused by redundancy:
Insertion anomaly
Deletion anomaly
Updation anomaly
Insertion Anomaly
Deletion Anomaly
If the details of students in this table are deleted then the details of
the college will also get deleted which should not occur by common
sense. This anomaly happens when the deletion of a data record
results in losing some unrelated information that was stored as part
of the record that was deleted from a table.
It is not possible to delete some information without losing some
other information in the table as well.
Updation Anomaly
Suppose the rank of the college changes. The change then has to be
made everywhere the rank is stored in the database, which is
time-consuming and computationally costly.
Every copy must be updated. If the update is missed in even one place,
the database is left in an inconsistent state.
Redundancy in a database occurs when the same data is stored in
multiple places. Redundancy can cause various problems such as
data inconsistencies, higher storage requirements, and slower data
retrieval.
Problems Caused Due to Redundancy:
Redundancy leads to wasted storage space and increased maintenance
complexity. This can also lead to confusion and errors, as different
copies of the data may have different values or be out of sync.
Data Integrity: Redundancy can also compromise data integrity,
as changes made to one copy of the data may not be reflected in
the other copies. This can result in inconsistencies and errors and
can make it difficult to ensure that the data is accurate and up-to-
date.
Usability Issues: Redundancy can also create usability issues, as
users may have difficulty accessing the correct version of the data
or may be confused by inconsistencies and errors. This can lead to
frustration and decreased productivity, as users spend more time
searching for the correct data or correcting errors.
To prevent redundancy in a database, normalization techniques can
be used. Normalization is the process of organizing data in a
database to eliminate redundancy and improve data
integrity. Normalization involves breaking down a larger table into
smaller tables and establishing relationships between them. This
reduces redundancy and makes the database more efficient and
reliable.
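As a minimal sketch of this idea, the Python snippet below splits a redundant student table into a student table and a college table. The rows, the column names and the helper names `normalize` and `rank_of` are hypothetical and only serve as an illustration.

```python
# Hypothetical wide table: the college rank is repeated for every student.
students_wide = [
    {"student_id": 1, "name": "Asha", "college": "College A", "college_rank": 1},
    {"student_id": 2, "name": "Ravi", "college": "College A", "college_rank": 1},
    {"student_id": 3, "name": "Meena", "college": "College B", "college_rank": 5},
]


def normalize(rows):
    """Split the wide table into a student table and a college table."""
    students = [
        {"student_id": r["student_id"], "name": r["name"], "college": r["college"]}
        for r in rows
    ]
    # One entry per college, so the rank is stored exactly once.
    colleges = {r["college"]: {"college_rank": r["college_rank"]} for r in rows}
    return students, colleges


def rank_of(student_id, students, colleges):
    """Look up a student's college rank through the relationship."""
    student = next(s for s in students if s["student_id"] == student_id)
    return colleges[student["college"]]["college_rank"]


students, colleges = normalize(students_wide)

# A rank change is now a single assignment instead of one update per student.
colleges["College A"]["college_rank"] = 2
print(rank_of(1, students, colleges), rank_of(2, students, colleges))
```

Because the rank lives in one place, the update anomaly described earlier cannot leave two students of the same college with different ranks.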
Advantages of data redundancy in DBMS
o Provides Data Reliability: Reliable data improves accuracy
because organizations can check and confirm whether data is
correct.
o Creates Data Backup: Redundant copies can serve as a backup if one
copy of the data is lost or damaged.
Decomposition
Decomposition in DBMS
Types of Decomposition
There are two types of Decomposition:
Lossless Decomposition
Lossy Decomposition
Lossless Decomposition
If no information is lost from the relation that is decomposed, then
the decomposition is lossless.
A lossless decomposition guarantees that joining the decomposed
relations results in the same relation that was decomposed.
A decomposition is said to be lossless if the natural join of all the
decomposed relations gives back the original relation.
Example:
EMPLOYEE_DEPARTMENT table:
EMPLOYEE table:
EMP_ID  EMP_NAME   EMP_AGE  EMP_CITY
22      Denim      28       Mumbai
33      Alina      25       Delhi
46      Stephan    30       Bangalore
52      Katherine  36       Mumbai
60      Jack       40       Noida
DEPARTMENT table
DEPT_ID  EMP_ID  DEPT_NAME
827      22      Sales
438      33      Marketing
869      46      Finance
575      52      Production
678      60      Testing
Now, when these two relations are joined on the common column
"EMP_ID", then the resultant relation will look like:
Employee ⋈ Department
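EMP_ID  EMP_NAME   EMP_AGE  EMP_CITY   DEPT_ID  DEPT_NAME
22      Denim      28       Mumbai     827      Sales
33      Alina      25       Delhi      438      Marketing
46      Stephan    30       Bangalore  869      Finance
52      Katherine  36       Mumbai     575      Production
60      Jack       40       Noida      678      Testing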
Example:
A B C
55 16 27
48 52 89
Now we decompose this relation into two sub relations R1 and R2
R1(A, B)
A B
55 16
48 52
R2(B, C)
B C
16 27
52 89
After performing the Join operation we get the same original relation
A B C
55 16 27
48 52 89
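The round trip in this example can be checked with a short Python sketch. The helpers `project`, `natural_join` and `same_relation` are illustrative names, not part of any library, and relations are modelled simply as lists of dictionaries.

```python
def project(rows, attrs):
    """Project rows onto attrs; the set removes duplicate rows."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in distinct]


def natural_join(left, right):
    """Join two relations on every attribute name they have in common."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                merged = {**l_row, **r_row}
                if merged not in joined:
                    joined.append(merged)
    return joined


def same_relation(rows_a, rows_b):
    """Compare two relations as sets of rows, ignoring row order."""
    def as_set(rows):
        return {frozenset(row.items()) for row in rows}
    return as_set(rows_a) == as_set(rows_b)


R = [{"A": 55, "B": 16, "C": 27}, {"A": 48, "B": 52, "C": 89}]
R1 = project(R, ("A", "B"))
R2 = project(R, ("B", "C"))
rejoined = natural_join(R1, R2)
print(same_relation(rejoined, R))  # True
```

For this instance the join returns exactly the two original rows, which is what makes the decomposition lossless here.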
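Lossy Decomposition
Example: Consider a table `R(A, B, C)` with the dependency `A → B`.
Decomposing it into `R1(A, B)` and `R2(B, C)` is lossy in general: the
shared attribute `B` determines neither `A` nor `C`, so the natural
join of `R1` and `R2` can produce extra (spurious) rows and the
original table cannot be recreated.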
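The lossy case can be made concrete with a small Python sketch. The two-row instance of R below is a made-up example in which `A → B` holds; relations are modelled as sets of tuples.

```python
# Hypothetical instance of R(A, B, C); A -> B holds (1 -> X, 2 -> X).
R = {(1, "X", "P"), (2, "X", "Q")}

# Project onto R1(A, B) and R2(B, C); sets remove duplicate rows.
R1 = {(a, b) for a, b, _ in R}
R2 = {(b, c) for _, b, c in R}

# Natural join on the shared attribute B.
joined = {(a, b, c) for a, b in R1 for b2, c in R2 if b == b2}

# Rows that appear after the join but were never in R.
spurious = joined - R
print(sorted(spurious))  # [(1, 'X', 'Q'), (2, 'X', 'P')]
```

The join contains four rows instead of two, so the original relation cannot be told apart from the spurious rows.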
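By contrast, consider a relation R(A, B, C) whose instance is
decomposed into R1(A, B) and R2(A, C), which share the attribute A:
R1(A, B):
|A |B |
|----|----|
|1 |X |
|1 |Y |
|2 |Z |
R2(A, C):
|A |C |
|----|----|
|1 |P |
|2 |Q |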
Now, if we take the natural join of R1 and R2 on attribute A, we get
back the original relation R. Therefore, this is a lossless
decomposition.
Dependency Preserving
o It is an important constraint of the database.
o In a dependency-preserving decomposition, every dependency must be
satisfied by at least one decomposed table.
o If a relation R is decomposed into relation R1 and R2, then the
dependencies of R either must be a part of R1 or R2 or must be
derivable from the combination of functional dependencies of
R1 and R2.
o For example, suppose there is a relation R(A, B, C, D) with
functional dependency set {A->BC}. The relation R is
decomposed into R1(ABC) and R2(AD), which is dependency
preserving because the FD A->BC is a part of relation R1(ABC).
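o Dependency Preservation: A decomposition D = {R1, R2,
R3, …, Rn} of R is dependency preserving with respect to a set F of
functional dependencies if the closure of F1 ∪ F2 ∪ … ∪ Fn equals
the closure of F, where Fi is the set of dependencies in the closure
of F that involve only the attributes of Ri.
o For a decomposition into R1 and R2 with projected dependency sets
f1 and f2, there can be three cases:
o Closure of (f1 ∪ f2) = closure of F -----> Decomposition is
dependency preserving.
o Closure of (f1 ∪ f2) is a proper subset of the closure of F ----->
Not dependency preserving.
o Closure of (f1 ∪ f2) is a proper superset of the closure of F ----->
This case is not possible.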
Problem:
Let R(A, B, C, D) be a relation with the functional dependencies
{AB --> C, C --> D, D --> A}. R is decomposed into R1(A, B, C)
and R2(C, D). Check whether the decomposition is dependency
preserving or not.
Solution:
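First find the dependencies that hold on R1(A, B, C) by taking
closures and keeping only the attributes of R1:
closure(A) = {A} // only the trivial dependency
closure(B) = {B} // only the trivial dependency
closure(C) = {C, A, D}, but D is dropped because it is not an
attribute of R1
= {C, A}
C --> A // C is removed from the right side because it is trivial
closure(AB) = {A, B, C, D}
= {A, B, C}
AB --> C // A and B are removed from the right side because they
are trivial
closure(BC) = {B, C, D, A}
= {A, B, C}
BC --> A // B and C are removed from the right side because they
are trivial
closure(AC) = {A, C, D}
= {A, C}
No non-trivial dependency (null set)
Repeating this for R2(C, D): closure(C) = {C, D, A}, restricted to
{C, D}, gives C --> D, while closure(D) = {D, A}, restricted to {D},
gives nothing.
The union of the projected dependencies is {C --> A, AB --> C,
BC --> A, C --> D}. Under this set closure(D) = {D}, so D --> A cannot
be derived and the decomposition is not dependency preserving.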
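The closure-based procedure used in this problem can be sketched in Python. This is a minimal illustration, assuming single-letter attribute names so that a string such as "AB" stands for the attribute set {A, B}; the function names `closure`, `project_fds` and `is_dependency_preserving` are chosen here and are not from any library.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(fds, sub_attrs):
    """Non-trivial dependencies implied by fds that hold on sub_attrs."""
    projected = []
    for size in range(1, len(sub_attrs) + 1):
        for lhs in combinations(sorted(sub_attrs), size):
            # Keep only attributes of the sub-relation, drop trivial ones.
            rhs = (closure(lhs, fds) & set(sub_attrs)) - set(lhs)
            if rhs:
                projected.append((set(lhs), rhs))
    return projected


def is_dependency_preserving(fds, decomposition):
    """True if every FD follows from the union of the projected FDs."""
    union = [fd for sub in decomposition for fd in project_fds(fds, sub)]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)


F = [("AB", "C"), ("C", "D"), ("D", "A")]
print(project_fds(F, "ABC"))                      # C -> A, AB -> C, BC -> A
print(is_dependency_preserving(F, ["ABC", "CD"]))  # False
```

The check fails because D --> A cannot be derived from the dependencies that hold on R1 and R2. Enumerating every subset of attributes is exponential, so this approach only suits small textbook schemas.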
Example: If you have the functional dependency `A → B` in the
original table, but none of the decomposed tables contains both
`A` and `B`, this functional dependency is not preserved unless
it can be derived from the dependencies that do hold on the
decomposed tables.
A → B
B → C
R1(A, B) with FD A → B
R2(B, C) with FD B → C
3. Increased Complexity
4. Redundancy
5. Performance Overhead
Functional Dependency
X → Y
Example:
Emp_Id → Emp_Name
Example:
ID → Name,
Name → DOB
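A dependency such as Emp_Id → Emp_Name can be tested against a table instance with a few lines of Python. The rows and the helper name `holds` below are hypothetical and only illustrate the definition: two rows that agree on the left side must also agree on the right side.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical employee rows; two employees share the same name.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha"},
    {"Emp_Id": 2, "Emp_Name": "Ravi"},
    {"Emp_Id": 3, "Emp_Name": "Asha"},
]

print(holds(employees, ["Emp_Id"], ["Emp_Name"]))  # True
print(holds(employees, ["Emp_Name"], ["Emp_Id"]))  # False
```

A single instance can only refute a dependency. Whether a dependency holds for every valid instance is a property of the schema and has to come from the meaning of the data.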
Reasoning about Functional Dependency:
Inference Rule (IR):
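Functional dependencies have six inference rules:
1. Reflexive Rule (IR1):
If X ⊇ Y then X → Y
Example:
X = {a, b, c, d, e}
Y = {a, b, c}
Since Y is a subset of X, X → Y holds.
2. Augmentation Rule (IR2):
If X → Y then XZ → YZ
3. Transitive Rule (IR3):
If X → Y and Y → Z then X → Z
4. Union Rule (IR4):
If X → Y and X → Z then X → YZ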
Proof:
1. X → Y (given)
2. X → Z (given)
3. X → XY (using IR2 on 1 by augmentation with X, where XX = X)
4. XY → YZ (using IR2 on 2 by augmentation with Y)
5. X → YZ (using IR3 on 3 and 4)
5. Decomposition Rule (IR5):
If X → YZ then X → Y and X → Z
Proof:
1. X → YZ (given)
2. YZ → Y (using IR1)
3. X → Y (using IR3 on 1 and 2)
4. X → Z follows in the same way, using YZ → Z (IR1) in step 2
6. Pseudo transitive Rule (IR6):
If X → Y and YZ → W then XZ → W
Proof:
1. X → Y (given)
2. YZ → W (given)
3. XZ → YZ (using IR2 on 1 by augmenting with Z)
4. XZ → W (using IR3 on 3 and 2)
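The derived rules can be cross-checked mechanically with the attribute-closure test: X → Y follows from a set of dependencies exactly when Y is contained in the closure of X. The Python sketch below assumes single-letter attribute names, and the helper names `closure` and `follows` are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def follows(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= closure(lhs, fds)


# Union rule (IR4): X -> Y and X -> Z give X -> YZ.
union_ok = follows([("X", "Y"), ("X", "Z")], "X", "YZ")

# Decomposition rule (IR5): X -> YZ gives X -> Y and X -> Z.
decomposition_ok = (follows([("X", "YZ")], "X", "Y")
                    and follows([("X", "YZ")], "X", "Z"))

# Pseudo transitive rule (IR6): X -> Y and YZ -> W give XZ -> W.
pseudo_ok = follows([("X", "Y"), ("YZ", "W")], "XZ", "W")

# A dependency that is not implied: X -> Y does not give Y -> X.
converse_ok = follows([("X", "Y")], "Y", "X")

print(union_ok, decomposition_ok, pseudo_ok, converse_ok)
# True True True False
```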