Database Techniques DB Normalization

Data base

Uploaded by

dontric360

Database Techniques

Database Normalization
What is Normalization?

• It is a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update, and deletion anomalies (problems).
• Its rules divide larger tables into smaller tables and link them using relationships.
• Normalization in SQL ensures that data is stored logically.
• Edgar Codd, the inventor of the relational model, proposed the normalization of data with the introduction of First Normal Form, and later extended it to Second and Third Normal Form.
• Later, he and Raymond F. Boyce jointly developed the theory of Boyce-Codd Normal Form.
Database Normal Forms
List of Normal Forms in SQL:
1. 1NF (First Normal Form)
2. 2NF (Second Normal Form)
3. 3NF (Third Normal Form)
4. BCNF (Boyce-Codd Normal Form)
5. 4NF (Fourth Normal Form)
6. 5NF (Fifth Normal Form)
7. 6NF (Sixth Normal Form)
In most practical applications, normalization achieves its best in 3rd Normal Form.
Example of Database Normalization
Suppose a video library maintains a database of movies rented out. Without any normalization or standardization in the database, all information is stored in one table as shown below.
In the table above, the Movies Rented column holds multiple values in a single cell.
1st Normal Form
First Normal Form rules:
• Each table cell should contain a single value.
• Each record needs to be unique.
Example of First Normal Form in DBMS
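As a rough sketch of the 1NF rule "one value per cell", the snippet below splits a multi-valued Movies Rented cell into one row per movie. It uses Python's built-in sqlite3 module purely for illustration; the table name, column names, and all sample rows are hypothetical and are not the deck's original data.

```python
import sqlite3

# Hypothetical unnormalized rows: movies_rented packs several titles into one cell.
unnormalized = [
    ("Member A", "1 Example Street", "Movie One, Movie Two"),
    ("Member B", "2 Sample Road", "Movie Three"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rentals_1nf (full_name TEXT, address TEXT, movie_rented TEXT)"
)

# 1NF: every cell holds a single value, so each rented movie gets its own row.
for full_name, address, movies in unnormalized:
    for movie in movies.split(","):
        conn.execute(
            "INSERT INTO rentals_1nf VALUES (?, ?, ?)",
            (full_name, address, movie.strip()),
        )

rows_1nf = conn.execute(
    "SELECT full_name, movie_rented FROM rentals_1nf "
    "ORDER BY full_name, movie_rented"
).fetchall()
print(rows_1nf)
```

Note that the member's name and address are now repeated on every rental row; that redundancy is what the later normal forms remove.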
• As we learnt earlier, an SQL key is a single column or a combination of multiple columns (a composite key) used to uniquely identify rows (also called tuples or records) in a table. The key chosen as the main identifier of a table is its primary key.
• A key is used to identify duplicate information, and it helps establish relationships between multiple tables in the database.

Note: Columns in a table that are NOT used to identify a record uniquely are called non-key columns.
In the table of our database, we have two people with the same name, Robert Phil, but they live in different places (Address).
In the above table, we need both Full Name and Address to identify a record uniquely. That is a composite key.
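A minimal sketch of a composite key, again assuming Python's sqlite3 module. The name Robert Phil comes from the slides; the two address strings and the members table definition are placeholders invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE members (
        full_name TEXT NOT NULL,
        address   TEXT NOT NULL,
        PRIMARY KEY (full_name, address)  -- composite key: both columns together
    )
    """
)

# Same name, different (hypothetical) addresses: both rows are accepted.
conn.execute("INSERT INTO members VALUES ('Robert Phil', 'First address (hypothetical)')")
conn.execute("INSERT INTO members VALUES ('Robert Phil', 'Second address (hypothetical)')")

# Repeating an existing (name, address) pair violates the composite key.
try:
    conn.execute("INSERT INTO members VALUES ('Robert Phil', 'First address (hypothetical)')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

member_count = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
print(member_count, duplicate_rejected)
```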
Second Normal Form rules:
• Rule 1: Be in 1NF.
• Rule 2: Have a single-column primary key, so that no non-key column is functionally dependent on only a subset of a candidate key.
• It is clear that we cannot bring our database into 2nd Normal Form unless we partition the table above.
• In the above, we have divided our 1NF table into two tables, Table 1 and Table 2. Table 1 contains member information. Table 2 contains information on movies rented.
• In Table 1 above, we introduced a new column called Membership_ID, which is the primary key of Table 1. Records in Table 1 can be uniquely identified using Membership_ID.
• In Table 2, Membership_ID is the foreign key.
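The split into Table 1 and Table 2 might look roughly like the following, assuming Python's sqlite3 module. The table names members and rentals, the column names, and the sample rows are hypothetical stand-ins for the tables in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Table 1: member information, identified by a single-column primary key.
    CREATE TABLE members (
        membership_id INTEGER PRIMARY KEY,
        full_name     TEXT NOT NULL,
        address       TEXT NOT NULL
    );
    -- Table 2: movies rented, linked back to members through membership_id.
    CREATE TABLE rentals (
        membership_id INTEGER NOT NULL REFERENCES members (membership_id),
        movie_rented  TEXT NOT NULL
    );
    INSERT INTO members VALUES (1, 'Member A', '1 Example Street');
    INSERT INTO members VALUES (2, 'Member B', '2 Sample Road');
    INSERT INTO rentals VALUES (1, 'Movie One');
    INSERT INTO rentals VALUES (1, 'Movie Two');
    INSERT INTO rentals VALUES (2, 'Movie Three');
    """
)

# A join reassembles the original 1NF view of the data on demand.
rentals_by_member = conn.execute(
    """
    SELECT m.full_name, r.movie_rented
    FROM members AS m
    JOIN rentals AS r ON r.membership_id = m.membership_id
    ORDER BY m.membership_id, r.movie_rented
    """
).fetchall()
print(rentals_by_member)
```

Each member's name and address are now stored once, however many movies that member rents.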
Foreign Key in DBMS

• A foreign key references the primary key of another table. It helps connect tables.
• It ensures that rows in one table have corresponding rows in another.
• Unlike primary keys, foreign keys do not have to be unique. Most often they aren't.
• Foreign keys can be null, whereas primary keys cannot.
Why do you need a foreign key?

Suppose a novice inserts a record into Table 2 with a membership ID that does not exist in Table 1.

• You will only be able to insert values into your foreign key that exist in the unique key of the parent table. This preserves referential integrity.
• The above problem can be overcome by declaring the membership ID in Table 2 as a foreign key referencing the membership ID in Table 1.
• Now, if somebody tries to insert a value in the membership ID field that does not exist in the parent table, an error will be shown!
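The behaviour described above can be sketched with Python's sqlite3 module. This is an illustration under assumptions: the table and column names are hypothetical, the orphan value 101 is an arbitrary example, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON (other database systems enforce them by default).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript(
    """
    CREATE TABLE members (
        membership_id INTEGER PRIMARY KEY,
        full_name     TEXT NOT NULL
    );
    CREATE TABLE rentals (
        membership_id INTEGER REFERENCES members (membership_id),
        movie_rented  TEXT NOT NULL
    );
    INSERT INTO members VALUES (1, 'Member A');
    """
)

# Parent row exists, so this insert succeeds.
conn.execute("INSERT INTO rentals VALUES (1, 'Movie One')")

# No member 101 exists, so the database rejects the orphan row.
try:
    conn.execute("INSERT INTO rentals VALUES (101, 'Movie Two')")
    orphan_rejected = False
except sqlite3.IntegrityError as exc:
    orphan_rejected = True
    print("insert rejected:", exc)

# A NULL foreign key is allowed, unlike a NULL primary key.
conn.execute("INSERT INTO rentals VALUES (NULL, 'Movie Three')")

rental_count = conn.execute("SELECT COUNT(*) FROM rentals").fetchone()[0]
print(rental_count, orphan_rejected)
```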
Transitive functional dependencies

• A transitive functional dependency exists when changing a non-key column might cause any of the other non-key columns to change.
• Consider Table 1. Changing the non-key column Full Name may change Salutation.
Third Normal Form rules:

• Rule 1: Be in 2NF.
• Rule 2: Have no transitive functional dependencies.
• To move our 2NF table into 3NF, we need to divide our table again.
3NF Example
• Below is a 3NF example in an SQL database:
In the above illustration, we divided our tables again and created a new table that stores salutations.
There are no transitive functional dependencies, and hence our tables are in 3NF.
In Table 3, Salutation ID is the primary key, and in Table 1, Salutation ID is a foreign key referencing the primary key of Table 3.
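A possible shape for the 3NF tables, sketched with Python's sqlite3 module. The salutations table plays the role of Table 3 and members the role of Table 1; the names, the salutation values, and the sample members are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforce the foreign key
conn.executescript(
    """
    -- Table 3: each salutation is stored exactly once.
    CREATE TABLE salutations (
        salutation_id INTEGER PRIMARY KEY,
        salutation    TEXT NOT NULL UNIQUE
    );
    -- Table 1: members point at a salutation instead of storing its text.
    CREATE TABLE members (
        membership_id INTEGER PRIMARY KEY,
        full_name     TEXT NOT NULL,
        address       TEXT NOT NULL,
        salutation_id INTEGER REFERENCES salutations (salutation_id)
    );
    INSERT INTO salutations VALUES (1, 'Mr.');
    INSERT INTO salutations VALUES (2, 'Ms.');
    INSERT INTO members VALUES (1, 'Member A', '1 Example Street', 2);
    INSERT INTO members VALUES (2, 'Member B', '2 Sample Road', 1);
    """
)

# Joining on salutation_id rebuilds the greeting when it is needed.
greetings = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.salutation || ' ' || m.full_name
        FROM members AS m
        JOIN salutations AS s ON s.salutation_id = m.salutation_id
        ORDER BY m.membership_id
        """
    )
]
print(greetings)
```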
Our example is now at a level that cannot be decomposed further to attain higher normal forms in a DBMS. In fact, it is already in the higher normal forms.
Separate efforts to move to the next levels of normalization are normally needed only in complex databases.
Let us discuss the next levels in brief:
• BCNF (Boyce-Codd Normal Form), also referred to as 3.5 Normal Form
• Even when a database is in 3rd Normal Form, anomalies can still occur if a table has more than one candidate key.
• 4NF (Fourth Normal Form) rules
• If no table contains two or more independent, multivalued facts describing the relevant entity, then the database is in 4th Normal Form.
• 5NF (Fifth Normal Form) rules
• A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed into any number of smaller tables without loss of data.
• 6NF (Sixth Normal Form), proposed
• 6th Normal Form is not yet standardized; it is still being discussed by database experts. Hopefully, we will have a clear and standardized definition of 6th Normal Form in the near future.
Denormalization
• It is a strategy applied to a previously normalized database to increase performance.
• It is the process of trying to improve the read (select) performance of a database, at the expense of some write (insert) performance, by adding redundant copies of data.
• It is motivated by performance or scalability needs in relational databases that must serve very large numbers of read operations.
Denormalization differs from the unnormalized form. Its benefits can only be fully realized on a data model that is already normalized.
Implementation
A normalized design will often store different but related information in separate logical tables (called relations).
If these relations are stored as separate files, completing a database query that draws information from several relations (a join operation such as a left join, right join, inner join, or outer join) can be slow.
If many relations are joined, it may be too slow.
There are two strategies for dealing with this.
DBMS support

The first strategy is to keep the logical design normalized but allow the DBMS to store additional redundant data on disk to optimize query response.
The DBMS must ensure that any redundant copies are kept consistent.
This method is often implemented in SQL as indexed views (Microsoft SQL Server) or materialized views (Oracle, PostgreSQL).
A view represents information in a format convenient for querying, and the index ensures that queries against the view are optimized physically.
• The DBMS allows you to create indices on views. Such views are called indexed or materialized views.
• When a unique clustered index is created on a view, the view is executed and the result set is stored in the database in the same way that a table with a clustered index is stored.
• Creating an indexed view is a two-step process:
1. Create the view using the CREATE VIEW statement with the
SCHEMABINDING clause.
2. Create the corresponding clustered index.

Example: Assume you have an employee table that is very large. The first step is to create a typical view that can be indexed to gain performance.

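-- Step 1: create the schema-bound view. SCHEMABINDING requires two-part
-- (schema.object) names; the dbo schema is assumed here.
CREATE VIEW dbo.v_enterMonth
WITH SCHEMABINDING
AS
SELECT empNo, DATEPART(MONTH, enterDate) AS enterMonth
FROM dbo.employee;
-- CREATE VIEW must be the only statement in its batch, hence the separator.
GO

-- Step 2: materialize the view by creating a unique clustered index on it.
CREATE UNIQUE CLUSTERED INDEX c_deptNo
ON dbo.v_enterMonth (enterMonth, empNo);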
DBA implementation

The second strategy is to denormalize the logical data design. This allows a similar improvement in query response to be achieved.
The database designer must ensure that the denormalized database does not become inconsistent.
This is done by creating rules in the database called constraints (restrictions) that specify how the redundant copies must be kept synchronized.
Denormalization versus not normalized data

• A denormalized model is not the same as a model that has not been normalized.
• Denormalization should take place only after normalization has been carried out and after any required constraints and rules have been created to deal with the inherent anomalies in the design.
• For example, all the relations are in third normal form and any relations with multi-valued dependencies are handled appropriately.
Examples of denormalization techniques include:

• Storing the count of the "many" elements in a one-to-many relationship as an attribute of the "one" relation
• Adding attributes to a relation from another relation with which it will be joined
• Star schemas, also known as fact-dimension models, which have been extended to snowflake schemas
• Prebuilt summarization or OLAP cubes
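The first technique in the list, storing a count on the "one" side, could be sketched as follows with Python's sqlite3 module. This sketch keeps the redundant copy consistent with triggers, which is one possible mechanism and not necessarily the constraint-based rules the slides have in mind; the rental_count column, the trigger names, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (
        membership_id INTEGER PRIMARY KEY,
        full_name     TEXT NOT NULL,
        rental_count  INTEGER NOT NULL DEFAULT 0  -- redundant, denormalized copy
    );
    CREATE TABLE rentals (
        membership_id INTEGER NOT NULL REFERENCES members (membership_id),
        movie_rented  TEXT NOT NULL
    );
    -- Triggers keep the stored count in step with the rentals table.
    CREATE TRIGGER rentals_after_insert AFTER INSERT ON rentals
    BEGIN
        UPDATE members SET rental_count = rental_count + 1
        WHERE membership_id = NEW.membership_id;
    END;
    CREATE TRIGGER rentals_after_delete AFTER DELETE ON rentals
    BEGIN
        UPDATE members SET rental_count = rental_count - 1
        WHERE membership_id = OLD.membership_id;
    END;
    INSERT INTO members (membership_id, full_name) VALUES (1, 'Member A');
    INSERT INTO rentals VALUES (1, 'Movie One');
    INSERT INTO rentals VALUES (1, 'Movie Two');
    INSERT INTO rentals VALUES (1, 'Movie Three');
    DELETE FROM rentals WHERE movie_rented = 'Movie Two';
    """
)

# Fast read: the count comes straight from members, with no join or aggregate.
stored_count = conn.execute(
    "SELECT rental_count FROM members WHERE membership_id = 1"
).fetchone()[0]
# Slow but authoritative read, used here only to check consistency.
actual_count = conn.execute(
    "SELECT COUNT(*) FROM rentals WHERE membership_id = 1"
).fetchone()[0]
print(stored_count, actual_count)
```

The trade-off matches the definition of denormalization given earlier: reads get cheaper, while every insert or delete on rentals now does extra work to maintain the copy.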
• With the increase in storage, processing power, and bandwidth at all levels, denormalization in databases has moved from being unusual to being the norm.
• One specific problem of denormalization is that it uses more storage (more columns in a database). Except in truly enormous systems, this aspect has become largely irrelevant, and using more storage is a non-issue.
