Normalisation

The document discusses database normalization. Normalization is a process of organizing data to eliminate redundancy and inconsistencies. It involves decomposing tables and removing duplicated data across tables. The document outlines several normal forms including 1st, 2nd, 3rd normal forms and BCNF. It provides examples to illustrate issues like insertion, update and deletion anomalies that can occur without proper normalization.
Normalization rules are used to check whether a database is structurally correct and optimal.

Database normalization is a technique for organizing the data in a database. It is a systematic approach of
decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics such as insertion, update and
deletion anomalies. It is a multi-step process that puts data into tabular form and removes duplicated data from the
relation tables.
Normalization is used mainly for two purposes:
• Eliminating redundant (repeated) data.
• Ensuring data dependencies make sense, i.e. data is logically stored.
Problems Without Normalization
If a table is not properly normalized and contains redundant data, it not only takes up extra storage space but also
becomes difficult to maintain and update without risking data loss. Consider a Student table that stores each student's
details together with the details of the student's branch, such as the branch name and its HOD.
Insertion Anomaly
For a new admission, a student's data cannot be inserted until the student opts for a branch, unless the branch
information is set to NULL.
Also, if we insert data for 100 students of the same branch, the branch information is repeated for all 100 students.
These scenarios are insertion anomalies.
Updation Anomaly
What if the HOD of a department leaves the college or is replaced? In that case the records of all students in that
branch have to be updated, and if even one record is missed, the data becomes inconsistent. This is an updation anomaly.
Deletion Anomaly
The Student table keeps two different kinds of information together: student information and branch information.
Hence, if student records are deleted at the end of the academic year, the branch information is lost as well.
This is a deletion anomaly.
Normalization:
First Normal Form
Second Normal Form
Third Normal Form
BCNF
Fourth Normal Form
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow these 4 rules:
• It should only have single-valued (atomic) attributes/columns.
• Values stored in a column should be of the same domain.
• All the columns in a table should have unique names.
• The order in which data is stored does not matter.

Second Normal Form (2NF): A table is in 2NF if it is in 1NF and every non-key column is fully dependent on the whole primary key.
• It should be in the First Normal Form.
• It should not have any Partial Dependency.
Third Normal Form (3NF): A table is in 3NF if it is in 2NF and no non-key column depends on another non-key column.
• It should be in the Second Normal Form.
• It should not have any Transitive Dependency.

Boyce-Codd Normal Form (BCNF) is a stricter version of the Third Normal Form. It deals with a type of anomaly that
is not handled by 3NF. A 3NF table that does not have multiple overlapping candidate keys is guaranteed to be in BCNF.
• It must be in the Third Normal Form.
• For each non-trivial functional dependency (X → Y), X should be a super key.
Fourth Normal Form (4NF)
A table is said to be in the Fourth Normal Form when:
• It is in the Boyce-Codd Normal Form.
• It doesn't have any Multi-Valued Dependency.
Normalization is a process of organizing the data in database to avoid data redundancy, insertion anomaly, update
anomaly & deletion anomaly.

emp_id  emp_name  emp_address  emp_dept
101     Rick      Delhi        D001
101     Rick      Delhi        D002
123     Maggie    Agra         D890
166     Glenn     Chennai      D900
166     Glenn     Chennai      D004


Update anomaly: In the above table we have two rows for employee Rick, as he belongs
to two departments of the company. If we want to update Rick's address, we have
to update it in both rows. If the correct address is updated in one row but not in the
other, then as per the database Rick has two different addresses, which is incorrect and
leaves the data inconsistent.

Insert anomaly: Suppose a new employee joins the company who is under training and
not yet assigned to any department. We would not be able to insert the data into
the table if the emp_dept field doesn’t allow nulls.

Delete anomaly: Suppose the company closes the department D890 at some point.
Deleting the rows that have emp_dept as D890 would also delete the
information of employee Maggie, since she is assigned only to this department.
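
The update and delete anomalies described above can be reproduced on the same five rows. The following is a minimal Python sketch, not part of the original slides; the helper name addresses_by_employee and the new address "Mumbai" are made up for illustration.

```python
# Rows mirror the employee table above: (emp_id, emp_name, emp_address, emp_dept)
rows = [
    (101, "Rick", "Delhi", "D001"),
    (101, "Rick", "Delhi", "D002"),
    (123, "Maggie", "Agra", "D890"),
    (166, "Glenn", "Chennai", "D900"),
    (166, "Glenn", "Chennai", "D004"),
]


def addresses_by_employee(table):
    """Collect the set of distinct addresses stored for each emp_id."""
    seen = {}
    for emp_id, _name, address, _dept in table:
        seen.setdefault(emp_id, set()).add(address)
    return seen


# Update anomaly: change Rick's address in only one of his two rows.
# "Mumbai" is a hypothetical new address used only for this illustration.
partially_updated = list(rows)
partially_updated[0] = (101, "Rick", "Mumbai", "D001")
inconsistent = {
    emp_id: addrs
    for emp_id, addrs in addresses_by_employee(partially_updated).items()
    if len(addrs) > 1
}

# Delete anomaly: removing department D890 removes every trace of Maggie (emp_id 123).
after_delete = [r for r in rows if r[3] != "D890"]
remaining_ids = {r[0] for r in after_delete}

print("Employees with conflicting addresses:", inconsistent)
print("Maggie still present after closing D890:", 123 in remaining_ids)
```

Because the address is stored once per department row, a single missed row is enough to leave two addresses on record for the same employee.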
Normalization
Here are the most commonly used normal forms:
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)
• Boyce & Codd normal form (BCNF)

First normal form (1NF)
As per the rule of first normal form, an attribute (column) of a table cannot hold multiple values. It should hold only
atomic values.
Example:

emp_id  emp_name  emp_address  emp_mobile
101     Herschel  New Delhi    8912312390
102     Jon       Kanpur       8812121212
                               9900012222
103     Ron       Chennai      7778881212
104     Lester    Bangalore    9990000123
                               8123450987
Two employees (Jon & Lester) have two mobile numbers each, so the company stored both numbers in the same field, as you
can see in the table above.
This table is not in 1NF. The rule says “each attribute of a table must have atomic (single) values”, and the emp_mobile
values for employees Jon & Lester violate that rule.

To make the table comply with 1NF we should store the data like this:

emp_id  emp_name  emp_address  emp_mobile
101     Herschel  New Delhi    8912312390
102     Jon       Kanpur       8812121212
102     Jon       Kanpur       9900012222
103     Ron       Chennai      7778881212
104     Lester    Bangalore    9990000123
104     Lester    Bangalore    8123450987
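
The conversion to 1NF shown above amounts to emitting one row per (employee, mobile number) pair. A minimal Python sketch follows, assuming the multi-valued field is held as a list; the function name to_first_normal_form is made up for illustration.

```python
# Non-1NF rows from the example: emp_mobile holds a list of numbers.
unnormalized = [
    (101, "Herschel", "New Delhi", ["8912312390"]),
    (102, "Jon", "Kanpur", ["8812121212", "9900012222"]),
    (103, "Ron", "Chennai", ["7778881212"]),
    (104, "Lester", "Bangalore", ["9990000123", "8123450987"]),
]


def to_first_normal_form(rows):
    """Emit one row per (employee, mobile) pair so every value is atomic."""
    return [
        (emp_id, name, address, mobile)
        for emp_id, name, address, mobiles in rows
        for mobile in mobiles
    ]


flat = to_first_normal_form(unnormalized)
for row in flat:
    print(row)
```

The four original rows become six, one for each mobile number, matching the 1NF table above.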


Second normal form (2NF)
A table is said to be in 2NF if both the following conditions hold:
• The table is in 1NF (first normal form).
• No non-prime attribute is dependent on a proper subset of any candidate key of the table.
An attribute that is not part of any candidate key is known as a non-prime attribute.

teacher_id  subject    teacher_age
111         Maths      38
111         Physics    38
222         Biology    38
333         Physics    40
333         Chemistry  40

Candidate key: {teacher_id, subject}
Non-prime attribute: teacher_age
The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because the non-prime attribute
teacher_age is dependent on teacher_id alone, which is a proper subset of the candidate key. This violates the rule for 2NF,
which says “no non-prime attribute is dependent on a proper subset of any candidate key of the table”.
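
The partial dependency described above can be checked against the sample rows. This is a minimal Python sketch with a made-up helper, holds; note that a data sample can only refute a functional dependency, it cannot prove that the dependency is a rule of the design.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


teachers = [
    {"teacher_id": 111, "subject": "Maths", "teacher_age": 38},
    {"teacher_id": 111, "subject": "Physics", "teacher_age": 38},
    {"teacher_id": 222, "subject": "Biology", "teacher_age": 38},
    {"teacher_id": 333, "subject": "Physics", "teacher_age": 40},
    {"teacher_id": 333, "subject": "Chemistry", "teacher_age": 40},
]

# teacher_id alone (a proper subset of the candidate key) determines teacher_age
by_teacher = holds(teachers, ("teacher_id",), ("teacher_age",))
# subject alone does not: Physics is taught by teachers aged 38 and 40
by_subject = holds(teachers, ("subject",), ("teacher_age",))

print("teacher_id -> teacher_age:", by_teacher)
print("subject -> teacher_age:", by_subject)
```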

To make the table comply with 2NF we can break it into two tables like this:
teacher_id  teacher_age
111         38
222         38
333         40

teacher_id  subject
111         Maths
111         Physics
222         Biology
333         Physics
333         Chemistry

Third normal form (3NF)
A table design is said to be in 3NF if both the following conditions hold:
• The table must be in 2NF.
• Any transitive functional dependency of a non-prime attribute on any super key should be removed.
An attribute that is a part of one of the candidate keys is known as a prime attribute.

emp_id  emp_name  emp_zip  emp_state  emp_city  emp_district
1001    John      282005   UP         Agra      Dayal Bagh
1002    Ajeet     222008   TN         Chennai   M-City
1006    Lora      282007   TN         Chennai   Urrapakkam
1101    Lilly     292008   UK         Pauri     Bhagwan
1201    Steve     222999   MP         Gwalior   Ratan


Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}… and so on
Candidate key: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime, as they are not part of any candidate key.

Here, emp_state, emp_city & emp_district are dependent on emp_zip, and emp_zip is dependent on emp_id. That makes the
non-prime attributes (emp_state, emp_city & emp_district) transitively dependent on the super key (emp_id). This violates
the rule of 3NF. To remove the transitive dependency, the table is split into two tables:

emp_id  emp_name  emp_zip
1001    John      282005
1002    Ajeet     222008
1006    Lora      282007
1101    Lilly     292008
1201    Steve     222999

emp_zip  emp_state  emp_city  emp_district
282005   UP         Agra      Dayal Bagh
222008   TN         Chennai   M-City
282007   TN         Chennai   Urrapakkam
292008   UK         Pauri     Bhagwan
222999   MP         Gwalior   Ratan
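
The 3NF decomposition above is a pair of projections, and joining them back on emp_zip should reproduce the original rows. A minimal Python sketch follows; the variable names employee and zip_lookup are chosen for illustration only.

```python
# Original rows: (emp_id, emp_name, emp_zip, emp_state, emp_city, emp_district)
employees = [
    (1001, "John", 282005, "UP", "Agra", "Dayal Bagh"),
    (1002, "Ajeet", 222008, "TN", "Chennai", "M-City"),
    (1006, "Lora", 282007, "TN", "Chennai", "Urrapakkam"),
    (1101, "Lilly", 292008, "UK", "Pauri", "Bhagwan"),
    (1201, "Steve", 222999, "MP", "Gwalior", "Ratan"),
]

# Projection onto (emp_id, emp_name, emp_zip)
employee = sorted({(e[0], e[1], e[2]) for e in employees})

# Projection onto (emp_zip, emp_state, emp_city, emp_district), keyed by emp_zip
zip_lookup = {e[2]: (e[3], e[4], e[5]) for e in employees}

# Joining the two projections on emp_zip rebuilds the original rows
rejoined = sorted((emp_id, name, z) + zip_lookup[z] for emp_id, name, z in employee)

print("Decomposition is lossless for this data:", rejoined == sorted(employees))
```

After the split, the state, city and district of a zip code are stored once per zip code rather than once per employee.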

Boyce Codd normal form (BCNF)

It is an advanced version of 3NF, which is why it is also referred to as 3.5NF. BCNF is stricter than 3NF. A table complies
with BCNF if it is in 3NF and, for every non-trivial functional dependency X -> Y, X is a super key of the table.

Example: Suppose there is a company wherein employees work in more than one department. They store the
data like this:

emp_id  emp_nationality  emp_dept                      dept_type  dept_no_of_emp
1001    Austrian         Production and planning       D001       200
1001    Austrian         stores                        D001       250
1002    American         design and technical support  D134       100
1002    American         Purchasing department         D134       600

Functional dependencies in the table above:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate key: {emp_id, emp_dept}
The table is not in BCNF as neither emp_id nor emp_dept alone is a key.
To make the table comply with BCNF we can break the table into three tables like this:
First table:
emp_id  emp_nationality
1001    Austrian
1002    American

Second table:
emp_dept                      dept_type  dept_no_of_emp
Production and planning       D001       200
stores                        D001       250
design and technical support  D134       100
Purchasing department         D134       600

Third table:
emp_id  emp_dept
1001    Production and planning
1001    stores
1002    design and technical support
1002    Purchasing department

Functional dependencies:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}
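
The BCNF condition "the left side of every functional dependency is a super key" can be tested with an attribute closure. The following Python sketch is an illustration under stated assumptions: the helper names closure and bcnf_violations are made up, and the check only looks at the two dependencies listed above, which is enough for this example but is not a general test for arbitrary decompositions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the listed FDs that apply to relation but whose left side is not a super key."""
    bad = []
    for lhs, rhs in fds:
        applies = lhs <= relation and (rhs & relation) - lhs
        if applies and not relation <= closure(lhs, fds):
            bad.append((lhs, rhs))
    return bad


fds = [
    (frozenset({"emp_id"}), frozenset({"emp_nationality"})),
    (frozenset({"emp_dept"}), frozenset({"dept_type", "dept_no_of_emp"})),
]

original = frozenset(
    {"emp_id", "emp_nationality", "emp_dept", "dept_type", "dept_no_of_emp"}
)
decomposed = [
    frozenset({"emp_id", "emp_nationality"}),
    frozenset({"emp_dept", "dept_type", "dept_no_of_emp"}),
    frozenset({"emp_id", "emp_dept"}),
]

print("Violations in the original table:", len(bcnf_violations(original, fds)))
for table in decomposed:
    print(sorted(table), "violations:", len(bcnf_violations(table, fds)))
```

In the original table both dependencies have a left side that is not a super key; in each of the three smaller tables no listed dependency violates the condition.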
Fourth normal form (4NF) is a level of database normalization where every non-trivial multivalued dependency is
determined by a candidate key (more precisely, a super key). It builds on the first three normal forms (1NF, 2NF and 3NF)
and the Boyce-Codd Normal Form (BCNF).

Properties – A relation R is in 4NF if and only if the following conditions are satisfied:
• It should be in the Boyce-Codd Normal Form (BCNF).
• The table should not have any multi-valued dependency.
Example – Consider the database of a class which has two relations: R1 contains student ID (SID) and student
name (SNAME), and R2 contains course ID (CID) and course name (CNAME).
Table – R1(SID, SNAME)
SID  SNAME
S1   A
S2   B

Table – R2(CID, CNAME)
CID  CNAME
C1   C
C2   D

When their cross product is taken, it results in multivalued dependencies.

Table – R1 X R2
SID  SNAME  CID  CNAME
S1   A      C1   C
S1   A      C2   D
S2   B      C1   C
S2   B      C2   D

Multivalued dependencies (MVD) are:
SID->->CID; SID->->CNAME; SNAME->->CNAME

Join dependency – Join decomposition is a further generalization of multivalued dependencies. If the join of R1 and R2
over C is equal to relation R, then we can say that a join dependency (JD) exists, where R1(A, B, C) and R2(C, D) are the
decomposition of a given relation R(A, B, C, D).

Example – The multivalued dependencies are:
Agent->->Company
Agent->->Product
Company->->Product

Table – R1
COMPANY  PRODUCT
C1       pendrive
C1       mic
C2       speaker
C1       speaker

Table – R2
AGENT  COMPANY
Aman   C1
Aman   C2
Mohan  C1

Table – R3
AGENT  PRODUCT
Aman   pendrive
Aman   mic
Aman   speaker
Mohan  speaker

Table – R1⋈R2⋈R3
COMPANY  PRODUCT   AGENT
C1       pendrive  Aman
C1       mic       Aman
C1       speaker   Aman
C1       speaker   Mohan
C2       speaker   Aman
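
The three-way natural join R1⋈R2⋈R3 can be computed directly. This is a minimal Python sketch, assuming R1 holds the pairs (C1, pendrive), (C1, mic), (C2, speaker) and (C1, speaker), which is what the row (C2, speaker, Aman) in the joined table requires.

```python
r1 = {("C1", "pendrive"), ("C1", "mic"), ("C2", "speaker"), ("C1", "speaker")}  # (COMPANY, PRODUCT)
r2 = {("Aman", "C1"), ("Aman", "C2"), ("Mohan", "C1")}  # (AGENT, COMPANY)
r3 = {("Aman", "pendrive"), ("Aman", "mic"), ("Aman", "speaker"), ("Mohan", "speaker")}  # (AGENT, PRODUCT)

# Natural join: keep a (company, product, agent) triple only if each of its
# three pairs appears in the corresponding two-column table.
joined = sorted(
    (company, product, agent)
    for company, product in r1
    for agent, agent_company in r2
    if agent_company == company and (agent, product) in r3
)

for row in joined:
    print(row)
```

A join dependency holds for a relation exactly when joining its projections gives back the relation with no extra rows, so comparing the output with the starting relation is a direct way to test it.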
Assignment 1:
1. Create two tables, Department and Employee, having the following fields.
Set the corresponding field properties.

Department:
Field name  Properties
Dcode       Text(10), Not Null, Primary key (“10” or “20” or “30”)
Dname       Text(20), Required (yes)
Dlocation   Text(30), Required (yes)

Employee:
Field name  Properties
Ecode       Text(10), Not Null, Primary key
Ename       Text(20), Required
Dcode       Text(10), Required
Eage        Number (long int.), Required, Decimal(0), Validation rule (should be >=18 and <55),
            Validation text (“check entered value”)
Eaddress    Text(30), Required
Esalary     Number(10), Required, Validation (>7000 and <25000)
DoJ         Date, Required, short date
DA          Number(10), Decimal(2)
HRA         Number(10), Decimal(2)
Tax         Number(10), Decimal(2)
Totamount   Currency, Cap. (total Amount), Decimal(auto), default(0)

2. Create a relationship between the two tables and insert 10 records into each table.
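
The field properties of Assignment 1 can be approximated outside the intended database tool. The following is a rough sketch using Python's sqlite3 module; it is an assumption-laden stand-in, not the expected solution. SQLite does not enforce text lengths such as Text(10), the validation rules are expressed as CHECK constraints, and all sample values (names, places, dates) are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces relationships only when enabled
conn.executescript("""
CREATE TABLE Department (
    Dcode     TEXT NOT NULL PRIMARY KEY CHECK (Dcode IN ('10', '20', '30')),
    Dname     TEXT NOT NULL,
    Dlocation TEXT NOT NULL
);
CREATE TABLE Employee (
    Ecode     TEXT NOT NULL PRIMARY KEY,
    Ename     TEXT NOT NULL,
    Dcode     TEXT NOT NULL REFERENCES Department(Dcode),
    Eage      INTEGER NOT NULL CHECK (Eage >= 18 AND Eage < 55),
    Eaddress  TEXT NOT NULL,
    Esalary   NUMERIC NOT NULL CHECK (Esalary > 7000 AND Esalary < 25000),
    DoJ       TEXT NOT NULL,
    DA        NUMERIC,
    HRA       NUMERIC,
    Tax       NUMERIC,
    Totamount NUMERIC DEFAULT 0
);
""")

EMP_SQL = (
    "INSERT INTO Employee (Ecode, Ename, Dcode, Eage, Eaddress, Esalary, DoJ) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)"
)


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical sample records, used only to exercise the constraints
ok_dept = try_insert("INSERT INTO Department VALUES (?, ?, ?)", ("10", "Sales", "Kolkata"))
ok_emp = try_insert(EMP_SQL, ("e001", "Sita", "10", 30, "Agra", 12000, "2001-04-01"))
too_young = try_insert(EMP_SQL, ("e002", "Ravi", "10", 17, "Agra", 12000, "2001-04-01"))
bad_dept = try_insert(EMP_SQL, ("e003", "Mina", "40", 30, "Agra", 12000, "2001-04-01"))

print(ok_dept, ok_emp, too_young, bad_dept)
```

The REFERENCES clause plays the role of the relationship between the two tables: an employee row is rejected when its Dcode has no matching department.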
Assignment 2 (Select & Update Query)

1. Insert a new field ‘Designation’ in the Employee table having the following properties: Text(10),
Required, Not Null, validation rule (“Officer” or “Clerk” or “Manager” or “Supervisor”).
2. Enter values in the ‘Designation’ field of the Employee table.
3. Select all employees from the Employee table, displaying their name, code, department name, department location,
basic and date of joining.
4. Select the name, age and DoJ of employees whose name starts with ‘s’.
5. Select employees whose age is greater than 45.
6. Select the name, basic, department code and location of employees who joined before the year 2000.
7. Select the name, age, location and designation of employees where designation = “Manager”.
8. Find the list of employees whose designation is neither “Manager” nor “Clerk”.
9. Update the Employee table as da = 10% of basic, hra = 15% of basic, ma = 12% of basic and tax = 20% of basic.
10. Update the Employee table as totsalary = [basic]+[da]+[hra]+[ma]-[tax].
11. Select the name, location, designation and totsalary of employees where totsalary > 24000.
12. Perform the above query with the same criteria along with designation = “Clerk”.
Assignment 3 (Make Table, Append & Delete)

1. Create a table ‘duplicate Employee’ from ‘Employee’ using a make table query.
2. Create a table ‘selective Employee’ from ‘Employee’ (all fields) using a make table query where
Ecode = “e004” or totsalary >= 10000.
3. Create a new table ‘combined’ having the fields Ecode, Ename, Dcode, DoJ and Totsalary from
Employee and Department using a make table query.
4. Create a blank table ‘duplicate Department’ having the same field names and data types as
‘Department’.
5. Insert records from the ‘Department’ table into the ‘duplicate Department’ table using an append query
where Dcode > 10.
6. Delete records from ‘selective Employee’ using a delete query with a suitable date
criterion.
7. Append records to the ‘selective Employee’ table from ‘Employee’ where Ecode > “e004”.
8. Delete records from the ‘selective Department’ table where Dcode = “30”.
9. Select records from the Employee table where the employee name starts with ‘s’.
10. Select the name, phone no., address, Ecode and Dcode of employees whose totsalary < 25000
or tax > 2000.
