0% found this document useful (0 votes)
34 views8 pages

DS Unit 2

Uploaded by

gopabandhukamila
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
34 views8 pages

DS Unit 2

Uploaded by

gopabandhukamila
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

B C A 4th SEM DATABASE ENGINEERING

Module-III
Functional Dependency
The functional dependency is a relationship that exists between two attributes. It
typically exists between the primary key and non-key attribute within a table.
1. X → Y
The left side of FD is known as a determinant, the right side of the production is
known as a dependent.
For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.
Here Emp_Id attribute can uniquely identify the Emp_Name attribute of employee
table because if we know the Emp_Id, we can tell that employee name associated
with it.
Functional dependency can be written as:
1. Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.

Types of Functional dependency

1. Trivial functional dependency


o A → B has trivial functional dependency if B is a subset of A.
o The following dependencies are also trivial like: A → A, B → B
Example:
1. Consider a table with two columns Employee_Id and Employee_Name.
2. {Employee_id, Employee_Name} → Employee_Id is a trivial functional dependen
cy as
3. Employee_Id is a subset of {Employee_Id, Employee_Name}.
4. Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name ar
e trivial dependencies too.
2. Non-trivial functional dependency

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

o A → B has a non-trivial functional dependency if B is not a subset of A.


o When A intersection B is NULL, then A → B is called as complete non-trivial.
Example:
1. ID → Name,
2. Name → DOB
What are Anomalies in DBMS
Normalization is necessary if you do not do it then the overall integrity of the data
stored in the database will eventually degrade. Specifically, this is due to data
anomalies. These anomalies in DBMS naturally occur and result in data that does
not match the real-world database purports to represent.
 What is anomalies in dbms.
 How we remove anomalies.
 What is insertion anomalies in dbms.
 What do you mean by anomalies in dbms.
Anomalies in DBMS are caused when there is too much redundancy in the
database‟s information. Anomalies can often be caused when the tables that make
up the database suffer from poor construction. So, what does “poor construction”
mean? Poor table design will become evident if, when the designer creates the
database, he doesn‟t identify the entities that depend on each other for existence,
like the rooms of a hotel and the hotel, and then minimize the chance that one would
ever exist independent of the other.
The normalization process was created largely in order to reduce the negative
effects of creating tables that will introduce anomalies into the database.
Example:
Suppose a manufacturing company stores the employee details in a table named
employee that has four attributes: emp_id for storing employee‟s id, e_name for
storing employee‟s name, e_address for storing employee‟s address, and e_dept for
storing the department details in which the employee works. At some point in time
the table looks like this:
e_id e_name e_address e_dept

e_id e_name e_address e_dept

101 Rick Delhi D001

101 Rick Delhi D002

123 Maggie Agra D890

166 Glenn Chennai D900

166 Glenn Chennai D004

The above table is not normalized. We will see the problems that we face when the
table is not normalized.

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

Type of Anomalies in DBMS Anomalies in dbms


Update anomaly:
In the above table, we have two rows for employee Rick as he belongs to two
departments of the company. If we want to update the address of Rick then we have
to update the same in two rows or the data will become inconsistent.
If somehow, the correct address gets updated in one department but not in other
then as per the database, Rick would be having two different addresses, which is not
correct and would lead to inconsistent data.
Insert anomaly:
Suppose a new employee joins the company, who is under training and currently not
assigned to any department then we would not be able to insert the data into the
table if the e_dept field doesn‟t allow nulls.
Delete anomaly:
Suppose, if at a point of time the company closes the department D890 then deleting
the rows that are having e_dept as D890 would also delete the information of
employee Maggie since she is assigned only to this department.

What Is Database Normalization?


Database normalization or SQL normalization helps us to group related data in one
single table. Any attributive data or indirectly related data are put in different tables
and these tables are connected with a logical relationship between parent and child
tables.
In 1970, Edgar F. Codd came up with the concept of normalization. He shared a
paper named “A Relational Model of Data for Large Shared Banks” in which he
proposed “First Normal Form (1NF)”.
Advantages Of DBMS Normalization
Database Normalization provides the following basic advantages:
1. Normalization increases data consistency as it avoids the duplicity of data by
storing the data in one place only.
2. Normalization helps in grouping like or related data under the same schema,
thereby resulting in the better grouping of data.
3. Normalization improves searching faster as indexes can be created faster.
Hence, the normalized database or table is used for OLTP (Online
Transaction Processing).

Disadvantages Of Database Normalization


DBMS Normalization has the following disadvantages:
1. We cannot find the associated data for, say a product or employee in one
place and we have to join more than one table. This causes a delay in
retrieving the data.
2. Thus, Normalization is not a good option in OLAP transactions (Online
Analytical Processing).
Before we proceed further, let’s understand the following terms:
 Entity: Entity is a real-life object, where the data associated with such an
object is stored in the table. The example of such objects are employees,
departments, students, etc.
 Attributes: Attributes are the characteristics of the entity, that give some
information about the Entity. For Example, if tables are entities, then the
columns are their attributes.

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

Types Of Normal Forms


#1) 1NF (First Normal Form)
By definition, an entity that does not have any repeating columns or data groups can
be termed as the First Normal Form. In the First Normal Form, every column is
unique.
Following is how our Employees and Department table would have looked if in
first normal form (1NF):
empNum lastName firstName deptName deptCity deptCountry

1001 Andrews Jack Accounts New York United States

1002 Schwatz Mike Technology New York United States

1009 Beker Harry HR Berlin Germany

1007 Harvey Parker Admin London United Kingdom

1007 Harvey Parker HR London United Kingdom


Here, all the columns of both Employees and Department tables have been clubbed
into one and there is no need of connecting columns, like deptNum, as all data is
available in one place.
But a table like this with all the required columns in it, would not only be difficult to
manage but also difficult to perform operations on and also inefficient from the
storage point of view.
#2) 2NF (Second Normal Form)
By definition, an entity that is 1NF and one of its attributes is defined as the primary
key and the remaining attributes are dependent on the primary key.
Following is an example of how the employees and department table would
look like:
Employees Table:
empNum lastName firstName

1001 Andrews Jack

1002 Schwatz Mike

1009 Beker Harry

1007 Harvey Parker

1007 Harvey Parker

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

Departments Table:
deptNum deptName deptCity deptCountry

1 Accounts New York United States

2 Technology New York United States

3 HR Berlin Germany

4 Admin London United Kingdom


EmpDept Table:
empDeptID empNum deptNum

1 1001 1

2 1002 2

3 1009 3

4 1007 4

5 1007 3
Here, we can observe that we have split the table in 1NF form into three different
tables. the Employees table is an entity about all the employees of a company and
its attributes describe the properties of each employee. The primary key for this table
is empNum.
Similarly, the Departments table is an entity about all the departments in a company
and its attributes describe the properties of each department. The primary key for
this table is the deptNum.
In the third table, we have combined the primary keys of both the tables. The primary
keys of the Employees and Departments tables are referred to as Foreign keys in
this third table.
If the user wants an output similar to the one, we had in 1NF, then the user has to
join all the three tables, using the primary keys.
A sample query would look as shown below:
SELECT empNum, lastName, firstName, deptNum, deptName, deptCity, deptCountry
FROM Employees A, Departments B, EmpDept C
WHERE A.empNum = C.empNum
AND B.deptNum = C.deptNum
WITH UR;
#3) 3NF (Third Normal Form)
By definition, a table is considered in third normal if the table/entity is already in the
second normal form and the columns of the table/entity are non-transitively
dependent on the primary key.
Let‟s understand non-transitive dependency, with the help of the following example.

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

Say a table named, Customer has the below columns:


CustomerID – Primary Key identifying a unique customer
CustomerZIP – ZIP Code of the locality customer resides in
CustomerCity – City the customer resides in
In the above case, the CustomerCity column is dependent on the CustomerZIP
column and the CustomerZIP column is dependent on CustomerID.
The above scenario is called transitive dependency of the CustomerCity column on
the CustomerID i.e. the primary key. After understanding transitive dependency, now
let‟s discuss the problem with this dependency.
There could be a possible scenario where an unwanted update is made to the table
for updating the CustomerZIP to a zipcode of a different city without updating the
CustomerCity, thereby leaving the database in an inconsistent state.
In order to fix this issue, we need to remove the transitive dependency that could be
done by creating another table, say, CustZIP table that holds two columns i.e.
CustomerZIP (as Primary Key) and CustomerCity.
The CustomerZIP column in the Customer table is a foreign key to the CustomerZIP
in the CustZIP table. This relationship ensures that there is no anomaly in the
updates wherein a CustomerZIP is updated without making changes to the
CustomerCity.
#4) Boyce-Codd Normal Form (3.5 Normal Form)
By definition, the table is considered Boyce-Codd Normal Form, if it‟s already in the
Third Normal Form and for every functional dependency between A and B, A should
be a super key.
This definition sounds a bit complicated. Let’s try to break it to understand it
better.
 Functional Dependency: The attributes or columns of a table are said to be
functionally dependent when an attribute or column of a table uniquely
identifies another attribute(s) or column(s) of the same table.
For Example, the empNum or Employee Number column uniquely identifies
the other columns like Employee Name, Employee Salary, etc. in the
Employee table.
 Super Key: A single key or group of multiple keys that could uniquely identify
a single row in a table can be termed as Super Key. In general terms, we
know such keys as Composite Keys.
Let‟s consider the following scenario to understand when there is a problem with
Third Normal Form and how does Boyce-Codd Normal Form comes to rescue.
empNum firstName empCity deptName deptHead

1001 Jack New York Accounts Raymond

1001 Jack New York Technology Donald

1002 Harry Berlin Accounts Samara

1007 Parker London HR Elizabeth

1007 Parker London Infrastructure Tom

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

In the above example, employees with empNum 1001 and 1007 work in two different
departments. Each department has a department head. There can be multiple
department heads for each department. Like for the Accounts department, Raymond
and Samara are the two heads of departments.
In this case, empNum and deptName are super keys, which implies that deptName
is a prime attribute. Based on these two columns, we can identify every single row
uniquely.
Also, the deptName depends on deptHead, which implies that deptHead is a non-
prime attribute. This criterion disqualifies the table from being part of BCNF.
To solve this we will break the table into three different tables as mentioned below:
Employees Table:
empNum firstName empCity deptNum

1001 Jack New York D1

1001 Jack New York D2

1002 Harry Berlin D1

1007 Parker London D3

1007 Parker London D4


Department Table:
deptNum deptName deptHead

D1 Accounts Raymond

D2 Technology Donald

D1 Accounts Samara

D3 HR Elizabeth

D4 Infrastructure Tom
#5) Fourth Normal Form (4 Normal Form)
By definition, a table is in Fourth Normal Form, if it does not have two or more,
independent data describing the relevant entity.
#6) Fifth Normal Form (5 Normal Form)
A table can be considered in Fifth Normal Form only if it satisfies the conditions for
Fourth Normal Form and can be broken down into multiple tables without loss of any
data.
Questions And Answers
Q #1) What is Normalization in a Database?
Answer: Database Normalization is a design technique. Using this we can design or
re-design schemas in the database to reduce redundant data and the dependency of
data by breaking the data into smaller and more relevant tables.
Q #2) What are the different types of Normalization?

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720


B C A 4th SEM DATABASE ENGINEERING

Answer: Following are the different types of normalization techniques that can be
employed to design database schemas:
 First Normal Form (1NF)
 Second Normal Form (2NF)
 Third Normal Form (3NF)
 Boyce-Codd Normal Form (3.5NF)
 Fourth Normal Form (4NF)
 Fifth Normal Form (5NF)
Q #3) What is the Purpose of Normalization?
Answer: The primary purpose of the normalization is to reduce the data redundancy
i.e. the data should only be stored once. This is to avoid any data anomalies that
could arise when we attempt to store the same data in two different tables, but
changes are applied only to one and not to the other.
Q #4) What is Denormalization?
Answer: Denormalization is a technique to increase the performance of the
database. This technique adds redundant data to the database, contrary to the
normalized database that removes the redundancy of the data.
This is done in huge databases where executing a JOIN to get data from multiple
tables is an expensive affair. Thus, redundant data are stored in multiple tables to
avoid JOIN operations.

Gopabandhu kamila – MCA(Comp. Sc) ,UNIPOWER,BDK - 6290829720

You might also like