Unit 6 DBMS

The document discusses database normalization. It defines normalization as organizing data and attributes to reduce redundancy. Normalization is achieved through four normal forms - 1NF, 2NF, 3NF, and BCNF. Each normal form introduces additional rules to reduce anomalies like insertion, update, and deletion anomalies. The document provides examples to demonstrate how a database can be normalized by decomposing tables and removing dependencies between attributes and keys in accordance with the normal form rules.

Unit – VI

Introduction

Normalization is the process of organizing the data and the attributes of a database. It
is performed to reduce the data redundancy in a database and to ensure that data is
stored logically.

Why Do We Need Normalization?


As discussed above, normalization is used to reduce data redundancy. A database anomaly is a flaw
in the database that arises from poor planning and redundancy. Normalization provides a method
to remove the following anomalies and bring the database to a more consistent state:

1. Insertion anomalies: These occur when we cannot insert data into a database because the
values of some other attributes are missing at the time of insertion.
2. Updation anomalies: These occur when the same data item is stored in several places that are
not linked to each other, so updating one copy leaves the other copies with stale values.
3. Deletion anomalies: These occur when deleting one piece of data also deletes other, still
necessary information from the database.

Normal Forms
There are four normal forms that are commonly used in relational databases, as shown in the
following figure:

[Figure: the four normal forms: 1NF, 2NF, 3NF, and BCNF]

First Normal Form (1NF)
A relation is in 1NF if every attribute is atomic, i.e., single-valued: the relation contains no
multi-valued or composite attribute. A composite or multi-valued attribute violates 1NF. To fix
this, we can create a new row for each value of the multi-valued attribute, which converts the
table into 1NF.

Let’s take an example of a relational table <EmployeeDetail> that contains the details of the
employees of the company.
<EmployeeDetail>

Employee Code   Employee Name   Employee Phone Number
101             Amit            98765623, 998234123
101             Amit            89023467
102             Sumit           76213908
103             Rohit           98132452

Here, the Employee Phone Number is a multi-valued attribute. So, this relation is not in 1NF.

To convert this table into 1NF, we give each Employee Phone Number its own row, as shown below:

<EmployeeDetail>

Employee Code   Employee Name   Employee Phone Number
101             Amit            998234123
101             Amit            98765623
101             Amit            89023467
102             Sumit           76213908
103             Rohit           98132452
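
The conversion above can be sketched in a few lines of Python. This is only an illustration: the
function name to_first_normal_form and the representation of rows as tuples are assumptions made
for the sketch, not part of any DBMS.

```python
# Each tuple mirrors a row of the un-normalized <EmployeeDetail> table.
rows = [
    (101, "Amit", "98765623,998234123"),
    (101, "Amit", "89023467"),
    (102, "Sumit", "76213908"),
    (103, "Rohit", "98132452"),
]

def to_first_normal_form(rows):
    """Emit one row per phone number so that every attribute value is atomic."""
    result = []
    for code, name, phones in rows:
        for phone in phones.split(","):
            result.append((code, name, phone.strip()))
    return result

normalized = to_first_normal_form(rows)
for row in normalized:
    print(row)
```

The four input rows become five output rows, one per phone number.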

Second Normal Form (2NF)


The normalization of 1NF relations to 2NF involves the elimination of partial dependencies.
A partial dependency in DBMS exists when a non-prime attribute, i.e., an attribute that is not
part of any candidate key, is functionally dependent on only a part (a proper subset) of a
candidate key rather than on the whole key.

For a relational table to be in second normal form, it must satisfy the following rules:

1. The table must be in first normal form.
2. It must not contain any partial dependency, i.e., all non-prime attributes are fully
functionally dependent on the primary key.

<EmployeeProjectDetail>

Employee Code   Project ID   Employee Name   Project Name
101             P03          John            Project103
101             P01          John            Project101
102             P04          Ryan            Project104
103             P02          Stephanie       Project102

In the above table, the prime attributes are Employee Code and Project ID, which together form
the candidate key. We have partial dependencies in this table because Employee Name can be
determined by Employee Code alone and Project Name can be determined by Project ID alone. Thus,
the above relational table violates the rule of 2NF.

To remove partial dependencies from this table and normalize it into second normal form, we can
decompose the <EmployeeProjectDetail> table into the following three tables:

<EmployeeDetail>

Employee Code   Employee Name
101             John
102             Ryan
103             Stephanie

<EmployeeProject>

Employee Code   Project ID
101             P03
101             P01
102             P04
103             P02

<ProjectDetail>

Project ID   Project Name
P03          Project103
P01          Project101
P04          Project104
P02          Project102

Thus, we've converted the <EmployeeProjectDetail> table into 2NF by decomposing it into the
<EmployeeDetail>, <ProjectDetail> and <EmployeeProject> tables. The resulting tables satisfy
both rules of 2NF: they are in 1NF, and every non-prime attribute is fully dependent on the
primary key.
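
As a rough check of this decomposition, the following Python sketch projects the original rows
onto the three smaller tables and joins them back. The variable names are assumptions chosen for
the sketch; sets are used so that duplicate rows produced by a projection collapse into one.

```python
rows = [
    (101, "P03", "John", "Project103"),
    (101, "P01", "John", "Project101"),
    (102, "P04", "Ryan", "Project104"),
    (103, "P02", "Stephanie", "Project102"),
]

# Projection onto a subset of columns; a set drops duplicate rows.
employee_detail = {(code, name) for code, _, name, _ in rows}
employee_project = {(code, pid) for code, pid, _, _ in rows}
project_detail = {(pid, pname) for _, pid, _, pname in rows}

# Natural join of the three tables on Employee Code and Project ID.
names = dict(employee_detail)
projects = dict(project_detail)
rejoined = {(code, pid, names[code], projects[pid]) for code, pid in employee_project}

print(sorted(employee_detail))
print(rejoined == set(rows))
```

The join reproduces exactly the original rows, so no information is lost by the decomposition,
while each employee name and each project name is now stored only once.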

Third Normal Form (3NF)


The normalization of 2NF relations to 3NF involves the elimination of transitive dependencies in
DBMS.

A functional dependency X -> Z is said to be transitive if the following three conditions hold:

• X -> Y
• Y does not -> X
• Y -> Z

For a relational table to be in third normal form, it must satisfy the following rules:

1. The table must be in the second normal form.
2. No non-prime attribute is transitively dependent on the primary key.
3. For each non-trivial functional dependency X -> Z, at least one of the following conditions
holds:

• X is a super key of the table.
• Z is a prime attribute of the table.

If a transitive dependency exists, we can divide the table to remove the transitively dependent
attributes and place them in a new table along with a copy of the determinant.

<EmployeeDetail>

Employee Code   Employee Name   Employee Zipcode   Employee City
101             John            110033             Model Town
101             John            110044             Badarpur
102             Ryan            110028             Naraina
103             Stephanie       110064             Hari Nagar

The above table is not in 3NF because it has the transitive dependency Employee Code ->
Employee City, which follows from:

• Employee Code -> Employee Zipcode
• Employee Zipcode -> Employee City

Also, Employee Zipcode is not a super key and Employee City is not a prime attribute.

To convert the table into 3NF, we decompose it into the following two tables:

<EmployeeDetail>

Employee Code   Employee Name   Employee Zipcode
101             John            110033
101             John            110044
102             Ryan            110028
103             Stephanie       110064

<EmployeeLocation>

Employee Zipcode   Employee City
110033             Model Town
110044             Badarpur
110028             Naraina
110064             Hari Nagar
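
Whether a functional dependency is contradicted by the stored rows can be checked mechanically.
The sketch below is a minimal illustration; the helper name fd_holds and the dictionary layout
are assumptions. Note that sample data can only refute a dependency: a dependency is a rule
about all valid states of the table, so passing the check on a few rows does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same determinant, different dependent value
    return True

location = [
    {"zipcode": "110033", "city": "Model Town"},
    {"zipcode": "110044", "city": "Badarpur"},
    {"zipcode": "110028", "city": "Naraina"},
    {"zipcode": "110064", "city": "Hari Nagar"},
]

print(fd_holds(location, ["zipcode"], ["city"]))

# A hypothetical extra row that maps an existing zipcode to a second city.
conflicting = location + [{"zipcode": "110033", "city": "Naraina"}]
print(fd_holds(conflicting, ["zipcode"], ["city"]))
```

The first call returns True for the <EmployeeLocation> rows; the second returns False because
the added row breaks Employee Zipcode -> Employee City.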

Boyce-Codd Normal Form (BCNF)


Boyce-Codd Normal Form (BCNF) is a stricter version of 3NF, as it imposes additional
constraints compared to 3NF.

For a relational table to be in Boyce-Codd normal form, it must satisfy the following rules:

1. The table must be in the third normal form.
2. For every non-trivial functional dependency X -> Y, X is a super key of the table. In
particular, X cannot be a non-prime attribute when Y is a prime attribute.

A superkey is a set of one or more attributes that can uniquely identify a row in a database table.

<EmployeeProjectLead>

Employee Code   Project ID   Project Leader
101             P03          HMS
101             P01          SMS
102             P04          LMS
103             P02          FMS
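
Rule 2 can be tested by computing attribute closures: X is a super key exactly when the closure
of X contains every attribute of the table. The sketch below applies this to the
<EmployeeProjectLead> schema. The two functional dependencies are assumptions made for
illustration (the notes do not list them): the pair (Employee Code, Project ID) determines
Project Leader, and each Project Leader leads exactly one project. The function names are
likewise hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)               # dependency is non-trivial
        and closure(lhs, fds) != set(all_attrs)   # left side is not a super key
    ]

attrs = ["employee_code", "project_id", "project_leader"]
fds = [
    (["employee_code", "project_id"], ["project_leader"]),  # assumed
    (["project_leader"], ["project_id"]),                   # assumed
]

print(bcnf_violations(attrs, fds))
```

Under these assumed dependencies the only violation reported is project_leader -> project_id:
Project Leader alone does not determine Employee Code, so it is not a super key.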

Transactions
A transaction is a set of operations that together perform a single logical unit of work.
Usually, a transaction changes the data present in the database. One of the primary uses of a
DBMS is to protect user data from system failures.

Example: Suppose a bank employee transfers Rs 800 from X's account to Y's account. This
small transaction contains several low-level tasks:

X's Account

1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)

Y's Account

1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)

Operations in Transaction
A transaction is made up of a set of operations that together perform one logical piece of work.
For example, when we withdraw money from an ATM, we go through the following set of operations:

1. Transaction initiated
2. Insert the ATM card
3. Select your choice of language
4. Select savings or current account
5. Enter the amount to withdraw
6. Enter your ATM PIN
7. Transaction processes
8. Collect the cash
9. Press finish to end the transaction

The above are the operations done by you. In the case of a transaction in a DBMS, however, there
are three major operations through which a transaction gets executed. These are:

1. Read / Access Data

2. Write / Change Data

3. Commit

Let's understand the above three sets of operations in a transaction with a real-life example of
transferring money from Account1 to Account2.

Initial balance in both accounts before the start of the transaction:

Account1 = 5000

Account2 = 2000

Before the start of the transaction, this data is stored in secondary memory (hard disk); once
the transaction is initiated, it is brought into primary memory (RAM) for faster access.

Now for a transfer of Rs. 500 from Account1 to Account2 to occur, the following set of
operations will take place.

Read (Account1) --> 5000

Account1 = Account1 - 500

Write (Account1) --> 4500

Read (Account2) --> 2000

Account2 = Account2 + 500

Write (Account2) --> 2500

commit
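
The same sequence can be imitated in Python to show that writes stay in the working copy until
the commit. This is a toy model under stated assumptions: the dictionaries disk and buffer stand
in for secondary and primary memory, and the functions read, write and commit are hypothetical
helpers, not a real DBMS interface.

```python
disk = {"Account1": 5000, "Account2": 2000}   # committed values (secondary memory)
buffer = {}                                   # working copies (primary memory)

def read(item):
    """Copy the committed value into the buffer and return it."""
    buffer[item] = disk[item]
    return buffer[item]

def write(item, value):
    """Change only the buffered copy; the disk is untouched until commit."""
    buffer[item] = value

def commit():
    """Make all buffered changes permanent."""
    disk.update(buffer)
    buffer.clear()

balance1 = read("Account1")
write("Account1", balance1 - 500)
balance2 = read("Account2")
write("Account2", balance2 + 500)

before_commit = dict(disk)   # snapshot taken just before the commit
commit()

print(before_commit)
print(disk)
```

Just before the commit the stored balances are still 5000 and 2000; after it they are 4500 and
2500.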

The main problem that can happen during a transaction is that it can fail before finishing all
the operations in the set. This can happen due to a power failure, a system crash, etc. This is
a serious problem that can leave the database in an inconsistent state. Assume that the
transaction fails after the third operation (see the example above): the amount would then be
deducted from Account1 but Account2 would not receive it.

To solve this problem, we have the following two operations:

Commit: If all the operations in a transaction are completed successfully, then commit those
changes to the database permanently.
Rollback: If any of the operations fails, then roll back all the changes done by the previous
operations.
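
Commit and rollback can be tried out with Python's built-in sqlite3 module. The sketch below is
one possible illustration: the table name account, the helper functions transfer and balances,
and the account name "Nobody" are assumptions made for the example. A transfer to an unknown
account fails after the debit has already been applied, and the rollback undoes that debit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('Account1', 5000), ('Account2', 2000)")
conn.commit()

def transfer(conn, source, target, amount):
    """Move amount between accounts; commit on success, roll back on any error."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, source)
        )
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, target)
        )
        if cur.rowcount != 1:
            raise LookupError("unknown target account")
        conn.commit()
        return True
    except (sqlite3.Error, LookupError):
        conn.rollback()   # undoes the debit that was already applied
        return False

def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer(conn, "Account1", "Account2", 500)
failed = transfer(conn, "Account1", "Nobody", 500)
print(ok, failed, balances(conn))
```

The first transfer commits; the second is rolled back, so Account1 keeps the balance it had
after the first transfer.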

Even though these operations help us avoid several issues that may arise during a transaction,
they are not sufficient when two transactions are running concurrently. To handle those problems,
we need to understand the ACID properties of a database.
ACID Properties in DBMS
A transaction is a single logical unit of work which accesses and possibly modifies the
contents of a database. Transactions access data using read and write operations.
In order to maintain consistency in a database, before and after the transaction, certain
properties are followed. These are called ACID properties.

Atomicity
It states that either all operations of the transaction take place or none of them do;
if any operation cannot be completed, the transaction is aborted. There is no midway,
i.e., the transaction cannot occur partially. Each transaction is treated as one unit
and either runs to completion or is not executed at all.
Atomicity involves the following two operations:

Abort: If a transaction aborts, then none of the changes made are visible.

Commit: If a transaction commits then all the changes made are visible.

Example: Let's assume a transaction T consisting of two parts, T1 and T2. Account A holds
Rs 600 and account B holds Rs 300. T transfers Rs 100 from account A to account B.

T1             T2
Read(A)        Read(B)
A := A - 100   B := B + 100
Write(A)       Write(B)

After completion of the transaction, A holds Rs 500 and B holds Rs 400.

If the transaction T fails after the completion of T1 but before the completion of T2,
then the amount will be deducted from A but not added to B. This leaves the database in
an inconsistent state. In order to ensure correctness of the database state, the
transaction must be executed in its entirety.

Consistency
The integrity constraints are maintained so that the database is consistent before
and after the transaction. The execution of a transaction leaves the database in
either its prior stable state or a new stable state. The consistency property
states that every transaction sees a consistent database instance. A transaction
transforms the database from one consistent state to another consistent state.

For example: The total amount must be the same before and after the transaction.

Total before T occurs = 600 + 300 = 900
Total after T occurs = 500 + 400 = 900

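
Atomicity and consistency can be illustrated together with the same numbers. In this minimal
sketch the function run_transfer and its fail_after_t1 flag are assumptions introduced for
illustration: the work is done on a copy, and the copy replaces the original balances only if
both parts of T finish.

```python
def run_transfer(accounts, amount, fail_after_t1=False):
    """Apply T1 (debit A) and T2 (credit B) to a dict of balances, all or nothing."""
    working = dict(accounts)       # changes go to a working copy first
    working["A"] -= amount         # T1
    if fail_after_t1:
        return accounts            # abort: discard the working copy
    working["B"] += amount         # T2
    # Consistency check: the total amount must be unchanged by the transfer.
    assert sum(working.values()) == sum(accounts.values())
    return working                 # commit

start = {"A": 600, "B": 300}
committed = run_transfer(start, 100)
aborted = run_transfer(start, 100, fail_after_t1=True)

print(committed, sum(committed.values()))
print(aborted, sum(aborted.values()))
```

In both outcomes the total stays at 900: either both balances change (500 and 400) or neither
does (600 and 300).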
Isolation
This property ensures that multiple transactions can occur concurrently without
leading to the inconsistency of database state. Transactions occur independently
without interference. Changes occurring in a particular transaction will not be
visible to any other transaction until that particular change in that transaction is
written to memory or has been committed.

Let X = 500, Y = 500.

Consider two transactions T1 and T2.

T1             T2
Read (X)       Read (X)
X := X * 100   Read (Y)
Write (X)      Z := X + Y
Read (Y)       Write (Z)
Y := Y - 50
Write (Y)

Suppose T1 has been executed till Read (Y) and then T2 starts. As a result,
interleaving of operations takes place, due to which T2 reads the updated value of X
but the old value of Y, which T1 has not yet changed. The sum computed by

T2: (X + Y = 50,000 + 500 = 50,500)

is thus not consistent with the sum at the end of transaction

T1: (X + Y = 50,000 + 450 = 50,450).

This results in database inconsistency, due to a loss of 50 units. Hence,
transactions must take place in isolation and changes should be visible only after
they have been written to memory or committed.
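
The interleaving described above can be replayed step by step. The sketch below is a plain
sequential simulation, not real concurrency; the dictionary db and the variable names are
assumptions used to make the order of operations explicit.

```python
db = {"X": 500, "Y": 500}

# T1 runs up to Read(Y): X has been updated and written, Y is still unchanged.
db["X"] = db["X"] * 100             # T1: Read(X), X := X * 100, Write(X)
y_in_t1 = db["Y"]                   # T1: Read(Y)

# T2 runs in full in the middle of T1.
z_seen_by_t2 = db["X"] + db["Y"]    # T2: Read(X), Read(Y), Z := X + Y

# T1 finishes.
db["Y"] = y_in_t1 - 50              # T1: Y := Y - 50, Write(Y)

final_sum = db["X"] + db["Y"]
print(z_seen_by_t2, final_sum, z_seen_by_t2 - final_sum)
```

T2 computes 50,500 while the sum after T1 completes is 50,450, which is the difference of 50
units mentioned above.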
Durability
This property ensures that once the transaction has completed execution, the
updates and modifications to the database are stored in and written to disk and
they persist even if a system failure occurs. These updates now become
permanent and are stored in non-volatile memory. The effects of the transaction,
thus, are never lost.

Transaction States in DBMS


During its lifetime, a transaction goes through several states. These states tell
the system, and through it the user, how far the transaction has progressed and how
to plan its further processing. They also determine the fate of the transaction:
whether it will commit or abort.

Following are the different transaction states:

Active State: While the operations of a transaction are running, the transaction
is said to be in the active state. If all the read and write operations are
performed without any error, it progresses to the partially committed state; if
any operation fails, it goes to the failed state.

Partially Committed: After all the read and write operations are completed, the
changes which were previously made in the main memory are now made
permanent in the database, after which the state will progress to committed
state but in case of a failure it will go to the failed state.

Failed State: If any operation during the transaction fails due to some software
or hardware issue, the transaction goes to the failed state. A failure during a
transaction must not leave a permanent change to data in the database, so the
changes made to the data in local memory are rolled back to the previous
consistent state.

Aborted State: If the transaction fails during its execution, it goes from failed
state to aborted state and because in the previous states all the changes were
only made in the main memory, these uncommitted changes are either deleted
or rolled back. The transaction at this point can restart and start afresh from the
active state.

Committed State: If the transaction completes all sets of operations successfully,
all the changes made during the partially committed state are permanently
stored and the transaction is said to be completed. The transaction can then
progress to the terminated state.

Terminated State: If the transaction gets aborted after roll-back or the
transaction comes from the committed state, then the database comes to a
consistent state and is ready for further new transactions since the previous
transaction is now terminated.
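
The states and the moves between them can be summarized as a small state machine. The sketch
below encodes only the transitions described above; the names State, TRANSITIONS and
is_valid_path are assumptions made for illustration.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"
    TERMINATED = "terminated"

# Allowed moves, taken from the descriptions of the individual states.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: {State.ACTIVE, State.TERMINATED},   # restart or give up
    State.COMMITTED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def is_valid_path(path):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))

success = [State.ACTIVE, State.PARTIALLY_COMMITTED, State.COMMITTED, State.TERMINATED]
failure = [State.ACTIVE, State.FAILED, State.ABORTED, State.TERMINATED]
invalid = [State.ACTIVE, State.COMMITTED]

print(is_valid_path(success), is_valid_path(failure), is_valid_path(invalid))
```

A transaction cannot jump from active straight to committed; it must pass through the
partially committed state first.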
