DBMS Unit-Iii

Uploaded by

tejasvimuddasani

UNIT-3

OVERVIEW:
• The SQL language has several aspects to it.
• DML (Data Manipulation Language)
• DDL (Data Definition Language)
• Triggers and Advanced Integrity Constraints
• Embedded and Dynamic SQL
• SQL code to be called from a host language such as C or COBOL or JAVA
• Dynamic SQL (a query is constructed and executed at run time)
• Client-Server Execution and Remote Database Access:
• These commands control how a client application program can connect to an SQL database
server or access a database over a network.
• Transaction Management (explicitly control aspects of how a transaction is to be executed)
• Security: mechanisms to control users' access to data objects such as tables and views
• Advanced Features: object-oriented features, recursive queries, decision support queries, data mining, spatial data, etc.
THE FORM OF A BASIC SQL QUERY
• The basic form of an SQL query is as follows:
• SELECT [DISTINCT] select-list
• FROM from-list
• WHERE qualification
• Every query must have a SELECT clause, which specifies columns to be retained in the result, and a FROM
clause, which specifies a cross-product of tables.
• The optional WHERE Clause specifies selection conditions on the tables mentioned in the FROM clause.
(Q15) Find the names and ages of all sailors.
SELECT DISTINCT S.sname, S.age FROM Sailors S

(Q11) Find all sailors with a rating above 7.


SELECT S.sid, S.sname, S.rating, S.age FROM Sailors AS S WHERE S.rating > 7

The SELECT clause performs projection, whereas selections in the relational algebra sense are expressed using the WHERE
clause.
The from-list in the FROM clause is a list of table names.
A table name can be followed by a range variable, which is particularly useful when the same table name appears more than once in the from-list.

The select-list is a list of column names of tables named in the from-list.

The qualification in the WHERE clause is a Boolean combination (using the connectives AND, OR and NOT) of conditions of the form
expression op expression, where op is one of the comparison operators <, <=, =, <>, >=, >. An expression is a column name, a constant, or an arithmetic or string expression.

Conceptual Evaluation Strategy:


1. Compute the cross-product of the tables in the from-list
2. Delete rows in the cross-product that fail the qualification conditions.
3. Delete all columns that do not appear in the select-list.
4. If DISTINCT is specified, eliminate duplicate rows.
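
The four steps above can be sketched directly in code. This is a minimal illustration in Python, not how a real DBMS evaluates queries (an optimizer would avoid building the full cross-product). The sample rows and the helper name `evaluate` are assumptions made for this example.

```python
from itertools import product

# Hypothetical sample rows (not from the notes).
# Sailors: (sid, sname, rating, age); Reserves: (sid, bid).
sailors = [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)]
reserves = [(22, 103), (31, 103), (31, 104)]

def evaluate(from_list, qualification, select_list, distinct=False):
    """Follow the conceptual evaluation strategy literally (correct but inefficient)."""
    rows = list(product(*from_list))                # 1. cross-product of the from-list
    rows = [r for r in rows if qualification(*r)]   # 2. drop rows that fail the qualification
    rows = [select_list(*r) for r in rows]          # 3. keep only the select-list columns
    if distinct:                                    # 4. eliminate duplicate rows
        rows = list(dict.fromkeys(rows))
    return rows

# SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid AND R.bid = 103
q1 = evaluate(
    [sailors, reserves],
    lambda s, r: s[0] == r[0] and r[1] == 103,
    lambda s, r: (s[1],),
)
print(q1)
```

Passing `distinct=True` corresponds to writing SELECT DISTINCT; without it, a sailor with several matching reservations appears once per reservation.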

(Q1) Find the names of sailors who have reserved boat number 103.

SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid AND R.bid = 103

(Q16) Find the sids of sailors who have reserved a red boat.
SELECT R.sid FROM Boats B, Reserves R WHERE B.bid = R.bid AND B.color = 'red'

(Q2) Find the names of sailors who have reserved a red boat.
SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'

(Q3) Find the colors of boats reserved by Lubber.

SELECT B.color FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND S.sname =
'Lubber'

(Q4) Find the names of sailors who have reserved at least one boat.
SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid
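
The join queries above can be tried on a small database. The following is a minimal sketch using Python's built-in sqlite3 module. The table contents are made-up sample rows (not taken from these notes), and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
-- Assumed sample rows, for illustration only.
INSERT INTO Sailors  VALUES (22,'Dustin',7,45.0), (31,'Lubber',8,55.5), (58,'Rusty',10,35.0);
INSERT INTO Boats    VALUES (102,'Interlake','red'), (103,'Clipper','green'), (104,'Marine','red');
INSERT INTO Reserves VALUES (22,102,'1998-10-10'), (22,103,'1998-10-08'),
                            (31,103,'1998-11-06'), (31,104,'1998-11-12');
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Q2: names of sailors who have reserved a red boat.
q2 = run("""SELECT S.sname FROM Sailors S, Reserves R, Boats B
            WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
            ORDER BY S.sname""")
# Q3: colors of boats reserved by Lubber.
q3 = run("""SELECT B.color FROM Sailors S, Reserves R, Boats B
            WHERE S.sid = R.sid AND R.bid = B.bid AND S.sname = 'Lubber'
            ORDER BY B.color""")
# Q4 with and without DISTINCT: one row per reservation vs. one row per name.
q4 = run("SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid ORDER BY S.sname")
q4_distinct = run("SELECT DISTINCT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid ORDER BY S.sname")
print(q2, q3, q4, q4_distinct)
```

Q4 without DISTINCT returns a sailor's name once for each reservation that sailor has made, which is why DISTINCT matters when only the set of names is wanted.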

Expressions and Strings in the SELECT Command


SQL supports a more general version of the select-list than just a list of columns.

Each item in a select-list can be of the form expression AS column name, where expression is any arithmetic or string
expression over column names and constants, and column name is a new name for this column in the output of the query.

(Q17) Compute increments for the ratings of persons who have sailed two different boats on the same day

SELECT S.sname, S.rating+1 AS rating FROM Sailors S, Reserves R1, Reserves R2 WHERE S.sid = R1.sid AND S.sid
= R2.sid AND R1.day = R2.day AND R1.bid <> R2.bid

In addition, SQL provides support for pattern matching through the LIKE operator, along with the wild-card
symbols % (which stands for zero or more arbitrary characters) and _ (which stands for exactly one arbitrary character).

Thus '_AB%' denotes a pattern matching every string that contains at least three characters, with the second and third
characters being A and B respectively.
(Q18) Find the ages of sailors whose name begins and ends with B and has at least three characters.

SELECT S.age FROM Sailors S WHERE S.sname LIKE 'B_%B'
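
A quick way to check how a LIKE pattern behaves is to run it against a few names. The sketch below uses Python's sqlite3 module with invented names. Note that SQLite's LIKE ignores case for ASCII letters by default, so the sample data is kept in upper case to avoid relying on that behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sname TEXT, age REAL)")
# Hypothetical sample names, chosen to exercise the two patterns.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?)",
    [("BOB", 63.5), ("BB", 20.0), ("BARB", 30.0), ("ABBA", 41.0), ("CABIN", 52.0)],
)

# 'B_%B': starts with B, then exactly one character, then anything, ends with B.
q18 = [row[0] for row in conn.execute(
    "SELECT S.age FROM Sailors S WHERE S.sname LIKE 'B_%B' ORDER BY S.age")]

# '_AB%': any first character, then A, then B, then anything.
second_third_ab = [row[0] for row in conn.execute(
    "SELECT S.sname FROM Sailors S WHERE S.sname LIKE '_AB%' ORDER BY S.sname")]
print(q18, second_third_ab)
```

"BB" is rejected by 'B_%B' because the underscore demands one character between the two Bs, which is what enforces the "at least three characters" requirement.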

UNION, INTERSECT, AND EXCEPT


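SQL provides three set-manipulation constructs that extend the basic query form presented earlier:
UNION
INTERSECT
EXCEPT
SQL also provides other set operations:
IN (to check if an element is in a given set)
op ANY and op ALL (to compare a value with the elements of a given set, using a comparison operator op)
EXISTS (to check if a set is empty)
IN and EXISTS can be prefixed by NOT.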

(Q5) Find the names of sailors who have reserved a red or a green boat.

SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND (B.color = 'red'
OR B.color = 'green')

Or Q5 can be rewritten as follows:

SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'

UNION

SELECT S2.sname FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND
B2.color = 'green'

(Q6) Find the names of sailors who have reserved both a red and a green boat.

SELECT S.sname FROM Sailors S, Reserves R1, Boats B1, Reserves R2, Boats B2 WHERE S.sid = R1.sid AND R1.bid =
B1.bid AND S.sid = R2.sid AND R2.bid = B2.bid AND B1.color = 'red' AND B2.color = 'green'

Or Q6 can be rewritten as follows:

SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'

INTERSECT

SELECT S2.sname FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND
B2.color = 'green'

(Q19) Find the sids of all sailors who have reserved red boats but not green boats.

SELECT S.sid FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
EXCEPT
SELECT S2.sid FROM Sailors S2, Reserves R2, Boats B2 WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color
= 'green'

OR

SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'red'
EXCEPT
SELECT R2.sid FROM Boats B2, Reserves R2 WHERE R2.bid = B2.bid AND B2.color = 'green'

(Q20) Find the sids of all sailors who have a rating of 10 or have reserved boat 104.

SELECT S.sid FROM Sailors S WHERE S.rating = 10


UNION
SELECT R.sid FROM Reserves R WHERE R.bid = 104
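
The difference between the three set operators is easiest to see by running the same pair of subqueries with each of them. The sketch below uses Python's sqlite3 module and made-up rows. It works on sids, following the shorter form of Q19, and the names `RED`, `GREEN` and `combine` are introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Assumed sample rows: sailor 22 reserves a red and a green boat,
-- sailor 31 only a red boat, sailor 64 only a green boat.
INSERT INTO Boats    VALUES (102,'red'), (103,'green'), (104,'red');
INSERT INTO Reserves VALUES (22,102), (22,103), (31,104), (64,103);
""")

RED = "SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'red'"
GREEN = "SELECT R2.sid FROM Boats B2, Reserves R2 WHERE R2.bid = B2.bid AND B2.color = 'green'"

def combine(op):
    """Run RED <op> GREEN and return the sids as a sorted list."""
    return sorted(row[0] for row in conn.execute(f"{RED} {op} {GREEN}"))

red_or_green = combine("UNION")          # reserved a red or a green boat
red_and_green = combine("INTERSECT")     # reserved both a red and a green boat
red_not_green = combine("EXCEPT")        # reserved red boats but not green boats
print(red_or_green, red_and_green, red_not_green)
```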

NESTED QUERIES
A Nested Query is a query that has another query within it; the embedded query is called a subquery.

1. Find the names of sailors who have reserved boat 103.

SELECT S.sname FROM Sailors S WHERE S.sid IN ( SELECT R.sid FROM Reserves R WHERE R.bid = 103 )

2. Find the names of sailors who have reserved a red boat.

SELECT S.sname FROM Sailors S WHERE S.sid IN ( SELECT R.sid FROM Reserves R
WHERE R.bid IN ( SELECT B.bid FROM Boats B WHERE B.color = 'red' ) )

(Q21) Find the names of sailors who have not reserved a red boat.

SELECT S.sname FROM Sailors S WHERE S.sid NOT IN ( SELECT R.sid
FROM Reserves R WHERE R.bid IN ( SELECT B.bid FROM Boats B WHERE B.color = 'red' ) )

(Q1) Find the names of sailors who have reserved boat number 103.

SELECT S.sname FROM Sailors S WHERE EXISTS ( SELECT * FROM Reserves R
WHERE R.bid = 103 AND R.sid = S.sid )
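
The IN form and the correlated EXISTS form of Q1 should return the same rows, and NOT IN gives the complement used in Q21. The following sketch checks this with Python's sqlite3 module on invented sample data. ORDER BY is added only for a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Assumed sample rows; Rusty has made no reservations at all.
INSERT INTO Sailors  VALUES (22,'Dustin'), (31,'Lubber'), (58,'Rusty'), (64,'Horatio');
INSERT INTO Boats    VALUES (102,'red'), (103,'green'), (104,'red');
INSERT INTO Reserves VALUES (22,102), (22,103), (31,104), (64,103);
""")

def names(sql):
    return [row[0] for row in conn.execute(sql)]

# Q1 with IN: the subquery is evaluated independently of the outer row.
q1_in = names("""SELECT S.sname FROM Sailors S
                 WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
                 ORDER BY S.sname""")
# Q1 with EXISTS: the subquery refers to S.sid, so it is correlated with the outer row.
q1_exists = names("""SELECT S.sname FROM Sailors S
                     WHERE EXISTS (SELECT * FROM Reserves R
                                   WHERE R.bid = 103 AND R.sid = S.sid)
                     ORDER BY S.sname""")
# Q21: sailors who have not reserved a red boat.
q21 = names("""SELECT S.sname FROM Sailors S
               WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R
                                   WHERE R.bid IN (SELECT B.bid FROM Boats B
                                                   WHERE B.color = 'red'))
               ORDER BY S.sname""")
print(q1_in, q1_exists, q21)
```

Q21 includes Rusty, who has no reservations, which a join-based formulation would silently miss.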

(Q22) Find sailors whose rating is better than some sailor called Horatio.

SELECT S.sid FROM Sailors S WHERE S.rating > ANY ( SELECT S2.rating
FROM Sailors S2 WHERE S2.sname = 'Horatio' )

(Q24) Find the sailors with the highest rating.

SELECT S.sid FROM Sailors S WHERE S.rating >= ALL (SELECT S2.rating FROM Sailors S2 )

(Q6) Find the names of sailors who have reserved both a red and a green boat.

SELECT S.sname FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
AND S.sid IN (SELECT S2.sid
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid
AND B2.color = 'green')
(Q9) Find the names of sailors who have reserved all boats.
SELECT S.sname FROM Sailors S
WHERE NOT EXISTS (( SELECT B.bid FROM Boats B )
EXCEPT
(SELECT R.bid FROM Reserves R WHERE R.sid = S.sid ))
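
Q9 is a "for all" (division) query. The sketch below, using Python's sqlite3 module and made-up rows, runs an equivalent formulation that replaces EXCEPT with a second NOT EXISTS: a sailor qualifies if there is no boat for which no reservation by that sailor exists. This rewrite is used here only because it avoids parenthesised set operations, which not every SQL engine accepts. It is not the form given in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boats    (bid INTEGER PRIMARY KEY);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Assumed sample rows: Dustin has reserved every boat, Lubber only one.
INSERT INTO Sailors  VALUES (22,'Dustin'), (31,'Lubber');
INSERT INTO Boats    VALUES (102), (103), (104);
INSERT INTO Reserves VALUES (22,102), (22,103), (22,104), (31,104);
""")

reserved_all = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE NOT EXISTS ( SELECT B.bid FROM Boats B
                       WHERE NOT EXISTS ( SELECT R.bid FROM Reserves R
                                          WHERE R.bid = B.bid AND R.sid = S.sid ) )
""").fetchall()

# Cross-check in plain Python: the set of all boats minus the sailor's boats must be empty,
# which is exactly the idea behind the EXCEPT formulation.
all_boats = {row[0] for row in conn.execute("SELECT bid FROM Boats")}
boats_of_22 = {row[0] for row in conn.execute("SELECT bid FROM Reserves WHERE sid = 22")}
print(reserved_all, all_boats - boats_of_22)
```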
AGGREGATE OPERATIONS
1. COUNT ([DISTINCT] A): The number of (unique) values in the A column.
2. SUM ([DISTINCT] A): The sum of all (unique) values in the A column.
3. AVG ([DISTINCT] A): The average of all (unique) values in the A column.
4. MAX (A): The maximum value in the A column.
5. MIN (A): The minimum value in the A column.

(Q25) Find the average age of all sailors.

SELECT AVG (S.age) FROM Sailors S

(Q26) Find the average age of sailors with a rating of 10.

SELECT AVG (S.age) FROM Sailors S WHERE S.rating = 10

(Q27) Find the name and age of the oldest sailor.

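A query such as SELECT S.sname, MAX (S.age) FROM Sailors S is not legal in SQL: if the SELECT clause uses an aggregate operation, it must use only aggregate operations unless the query contains a GROUP BY clause. Instead, we use a nested query:

SELECT S.sname, S.age FROM Sailors S WHERE S.age = ( SELECT MAX (S2.age) FROM Sailors S2 )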

(Q28) Count the number of sailors.

SELECT COUNT(*) FROM Sailors S

(Q29) Count the number of different sailor names.

SELECT COUNT ( DISTINCT S.sname ) FROM Sailors S

(Q30) Find the names of sailors who are older than the oldest sailor with a rating of 10.

SELECT S.sname FROM Sailors S


WHERE S.age > (SELECT MAX (S2.age) FROM Sailors S2 WHERE S2.rating = 10 )
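
The aggregate queries above can be checked on a small table. This sketch uses Python's sqlite3 module with invented rows (two sailors share the name Horatio so that COUNT and COUNT DISTINCT differ). Q27 is written as a nested query that compares each age with the maximum age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)")
# Assumed sample rows, for illustration only.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0),
    (64, "Horatio", 7, 35.0), (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0),
])

def run(sql):
    return conn.execute(sql).fetchall()

q26 = run("SELECT AVG(S.age) FROM Sailors S WHERE S.rating = 10")[0][0]
q27 = run("""SELECT S.sname, S.age FROM Sailors S
             WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)""")
q28 = run("SELECT COUNT(*) FROM Sailors S")[0][0]
q29 = run("SELECT COUNT(DISTINCT S.sname) FROM Sailors S")[0][0]
q30 = run("""SELECT S.sname FROM Sailors S
             WHERE S.age > (SELECT MAX(S2.age) FROM Sailors S2 WHERE S2.rating = 10)
             ORDER BY S.sname""")
print(q26, q27, q28, q29, q30)
```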

The GROUP BY and HAVING Clauses


(Q31) Find the age of the youngest sailor for each rating level.
If we know that ratings are integers in the range 1 to 10, we could write 10 queries of the form:

SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i

where i = 1, 2, ..., 10. Writing 10 such queries is tedious. More importantly, we may not know in advance what rating levels
exist.

To write such queries, we need a major extension to the basic SQL query form, namely, the GROUP BY clause.

The extension also includes an optional HAVING clause that can be used to specify qualifications over groups.

The general form of an SQL query with these extensions is:

SELECT [ DISTINCT ] select-list
FROM from-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification

Using the GROUP BY clause, we can write Q31 as follows:

SELECT S.rating, MIN (S.age) FROM Sailors S GROUP BY S.rating

(Q32) Find the age of the youngest sailor who is eligible to vote (i.e., is at least 18 years old) for each rating level with at least two
such sailors.
SELECT S.rating, MIN (S.age) AS minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating
HAVING COUNT (*) > 1

(Q33) For each red boat, find the number of reservations for this boat.
SELECT B.bid, COUNT (*) AS sailorcount FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid

(Q34) Find the average age of sailors for each rating level that has at least two sailors.
SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating HAVING COUNT (*) > 1

(Q35) Find the average age of sailors who are of voting age (i.e., at least 18 years old)
for each rating level that has at least two sailors.
SELECT S.rating, AVG ( S.age ) AS avgage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating
HAVING 1 < ( SELECT COUNT (*) FROM Sailors S2 WHERE S.rating = S2.rating )

(Q36) Find the average age of sailors who are of voting age (i.e., at least 18 years old) for each rating level that has at least two such
sailors.
SELECT S.rating, AVG ( S.age ) AS avgage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating
HAVING 1 < ( SELECT COUNT (*) FROM Sailors S2 WHERE S.rating = S2.rating AND S2.age >= 18 )

(Q37) Find those ratings for which the average age of sailors is the minimum over all ratings.
Aggregate operations cannot be nested, and an aggregate cannot appear directly in the WHERE clause, so a query written with
MIN (AVG (S2.age)) is not legal SQL. A correct formulation first computes the average age per rating in a derived table:
SELECT Temp.rating, Temp.avgage
FROM ( SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating ) AS Temp
WHERE Temp.avgage = ( SELECT MIN (Temp2.avgage)
FROM ( SELECT AVG (S2.age) AS avgage FROM Sailors S2 GROUP BY S2.rating ) AS Temp2 )
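
The interplay of WHERE (filters rows before grouping) and HAVING (filters groups after grouping) can be seen by running the grouping queries on sample data. The sketch below uses Python's sqlite3 module with invented rows. The last query finds the rating with the minimum average age by computing the per-rating averages in a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)")
# Assumed sample rows: two sailors per rating; Zorba (rating 10) is under 18.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "Dustin", 7, 45.0), (64, "Horatio", 7, 35.0),
    (31, "Lubber", 8, 55.5), (32, "Andy", 8, 25.5),
    (58, "Rusty", 10, 35.0), (71, "Zorba", 10, 16.0),
])

def run(sql):
    return conn.execute(sql).fetchall()

# Q31: youngest age per rating level.
q31 = run("SELECT S.rating, MIN(S.age) FROM Sailors S GROUP BY S.rating ORDER BY S.rating")

# Q32: WHERE removes Zorba first, so rating 10 has only one row left and HAVING drops the group.
q32 = run("""SELECT S.rating, MIN(S.age) AS minage FROM Sailors S
             WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1
             ORDER BY S.rating""")

# Q37: rating(s) whose average age is the smallest of all per-rating averages.
q37 = run("""SELECT Temp.rating, Temp.avgage
             FROM (SELECT S.rating, AVG(S.age) AS avgage
                   FROM Sailors S GROUP BY S.rating) AS Temp
             WHERE Temp.avgage = (SELECT MIN(Temp2.avgage)
                                  FROM (SELECT AVG(S2.age) AS avgage
                                        FROM Sailors S2 GROUP BY S2.rating) AS Temp2)""")
print(q31, q32, q37)
```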

COMPLEX INTEGRITY CONSTRAINTS IN SQL


Constraints over a Single Table

We can specify complex constraints over a single table using table constraints, which have the form CHECK conditional-expression.
For example, to ensure that rating must be an integer in the range 1 to 10, we could use:

CREATE TABLE Sailors ( sid INTEGER,
sname CHAR(10),
rating INTEGER,
age REAL,
PRIMARY KEY (sid),
CHECK ( rating >= 1 AND rating <= 10 ))
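
To see the table constraint being enforced, the sketch below creates the Sailors table in an in-memory SQLite database through Python's sqlite3 module and attempts one valid and one invalid insert. The inserted rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sailors ( sid INTEGER,
                           sname CHAR(10),
                           rating INTEGER,
                           age REAL,
                           PRIMARY KEY (sid),
                           CHECK ( rating >= 1 AND rating <= 10 ))
""")

# A rating inside the range 1..10 is accepted.
conn.execute("INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0)")

# A rating outside the range violates the CHECK constraint and the insert is rejected.
try:
    conn.execute("INSERT INTO Sailors VALUES (99, 'Nobody', 11, 30.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Sailors").fetchone()[0]
print(rejected, row_count)
```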

To enforce the constraint that Interlake boats cannot be reserved, we could use:
CREATE TABLE Reserves ( sid INTEGER,
bid INTEGER,
day DATE,
FOREIGN KEY (sid) REFERENCES Sailors,
FOREIGN KEY (bid) REFERENCES Boats,
CONSTRAINT noInterlakeRes
CHECK ( 'Interlake' <>
( SELECT B.bname
FROM Boats B
WHERE B.bid = Reserves.bid )))
Domain Constraints and Distinct Types
A user can define a new domain using the CREATE DOMAIN statement, which makes use of CHECK constraints.

CREATE DOMAIN ratingval INTEGER DEFAULT 0 CHECK ( VALUE >= 1 AND VALUE <= 10 )

INTEGER is the base type for the domain ratingval, and every ratingval value must be of this type. Values in ratingval are further
restricted by using a CHECK constraint; in defining this constraint, we use the keyword VALUE to refer to a value
in the domain.

The optional DEFAULT keyword is used to associate a default value with a domain. If the domain ratingval is used for a column in
some relation, and no value is entered for this column in an inserted tuple, the default value 0 associated with ratingval is used.

Assertions: ICs over Several Tables.

Table constraints are associated with a single table, although the conditional expression in the CHECK clause can refer to other tables.
Table constraints are required to hold only if the associated table is nonempty. Thus, when a constraint involves two or more tables,
the table constraint mechanism is sometimes cumbersome and not quite what is desired. To cover such situations, SQL supports the
creation of assertions, which are constraints not associated with any one table.

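As an example, suppose that we wish to enforce the constraint that the number of boats plus the number of sailors should be less than
100. We could try the following table constraint:
CREATE TABLE Sailors ( sid INTEGER,
sname CHAR(10),
rating INTEGER,
age REAL,
PRIMARY KEY (sid),
CHECK ( rating >= 1 AND rating <= 10 ),
CHECK ( ( SELECT COUNT (S.sid) FROM Sailors S )
+ ( SELECT COUNT (B.bid) FROM Boats B )
< 100 ))

This solution has two drawbacks: the constraint is attached to Sailors although it involves Boats in a completely symmetric way,
and if the Sailors table is empty the constraint is defined to hold even if Boats contains more than 100 rows. The better solution
is to create an assertion:
CREATE ASSERTION smallClub
CHECK ( ( SELECT COUNT (S.sid) FROM Sailors S )
+ ( SELECT COUNT (B.bid) FROM Boats B )
< 100 )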

TRIGGERS AND ACTIVE DATABASES


A trigger is a procedure that is automatically invoked by the DBMS in response to specified changes to the database, and is typically
specified by the DBA. A database that has a set of associated triggers is called an active database. A trigger description contains three
parts:
Event: A change to the database that activates the trigger.
Condition: A query or test that is run when the trigger is activated.
Action: A procedure that is executed when the trigger is activated and its condition is true.

A trigger can be thought of as a `daemon' that monitors a database, and is executed when the database is modified in a way that
matches the event specification. An insert, delete or update statement could activate a trigger, regardless of which user or application
invoked the activating statement; users may not even be aware that a trigger was executed as a side effect of their program.

Examples of Triggers in SQL
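
As one possible illustration of the event, condition and action parts, the following sketch defines a trigger in SQLite through Python's sqlite3 module. It uses SQLite's trigger dialect, which differs in detail from the SQL:1999 syntax. The table YoungSailors and the trigger name young_sailor_log are hypothetical names made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors      (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE YoungSailors (sid INTEGER, sname TEXT, age REAL);

CREATE TRIGGER young_sailor_log
AFTER INSERT ON Sailors              -- Event: a row is inserted into Sailors
FOR EACH ROW
WHEN NEW.age < 18                    -- Condition: tested when the trigger is activated
BEGIN                                -- Action: runs only if the condition is true
    INSERT INTO YoungSailors VALUES (NEW.sid, NEW.sname, NEW.age);
END;

-- Assumed sample inserts; only the second one satisfies the condition.
INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0);
INSERT INTO Sailors VALUES (71, 'Zorba', 10, 16.0);
""")

logged = conn.execute("SELECT sid, sname, age FROM YoungSailors").fetchall()
print(logged)
```

The program that inserts into Sailors never mentions YoungSailors, which shows how a trigger runs as a side effect of the activating statement.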


DESIGNING ACTIVE DATABASES
Triggers offer a powerful mechanism for dealing with changes to a database, but they must be used with caution. The effect of a
collection of triggers can be very complex and maintaining an active database can become very difficult. Often, a judicious use of
integrity constraints can replace the use of triggers.

In an active database system, when the DBMS is about to execute a statement that modifies the database, it checks whether some
trigger is activated by the statement. If so, the DBMS processes the trigger by evaluating its condition part, and then (if the Condition
evaluates to true) executing its action part.

If a statement activates more than one trigger, the DBMS typically processes all of them, in some arbitrary order. An important point
is that the execution of the action part of a trigger could in turn activate another trigger. In particular, the execution of the action part
of a trigger could again activate the same trigger; such triggers are called recursive triggers. The potential for such chain
activations, and the unpredictable order in which a DBMS processes activated triggers, can make it difficult to understand the
effect of a collection of triggers.
SCHEMA REFINEMENT AND NORMAL FORMS

INTRODUCTION TO SCHEMA REFINEMENT


Problems Caused by Redundancy
Storing the same information redundantly, that is, in more than one place within a database, can lead to several problems.

● Redundant Storage
● Update Anomalies
● Insertion Anomalies
● Deletion Anomalies

Decompositions
The essential idea is that many problems arising from redundancy can be addressed by replacing a relation with a
collection of ‘smaller’ relations.

A decomposition of a relation schema R consists of replacing the relation schema by two (or more) relation schemas
that each contain a subset of the attributes of R and together include all attributes in R. Intuitively, we want to store the
information in any given instance of R by storing projections of the instance.

We can decompose Hourly_Emps into two relations:

Hourly_Emps2(ssn, name, lot, rating, hours_worked)
Wages(rating, hourly_wages)
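
The effect of this decomposition can be sketched in a few lines of Python. The rows below are invented, and they assume that the original Hourly_Emps relation has the attributes (ssn, name, lot, rating, hourly_wages, hours_worked) with rating determining hourly_wages.

```python
# Hypothetical Hourly_Emps rows: (ssn, name, lot, rating, hourly_wages, hours_worked).
# The wage for a given rating is repeated in every row with that rating.
hourly_emps = [
    ("111", "Attishoo", 48, 8, 10, 40),
    ("222", "Smiley", 22, 8, 10, 30),
    ("333", "Smethurst", 35, 5, 7, 30),
    ("444", "Guldu", 35, 5, 7, 32),
]

# Project onto the two smaller schemas; sets remove the duplicate (rating, wage) pairs.
hourly_emps2 = sorted({(ssn, name, lot, rating, hours)
                       for ssn, name, lot, rating, _, hours in hourly_emps})
wages = sorted({(rating, wage) for _, _, _, rating, wage, _ in hourly_emps})

# Joining the projections on rating rebuilds the original rows.
wage_of = dict(wages)
rebuilt = sorted((ssn, name, lot, rating, wage_of[rating], hours)
                 for ssn, name, lot, rating, hours in hourly_emps2)
print(wages)
```

Each rating's wage is now stored once in Wages, so changing it is a single-row update instead of one update per employee.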
Problems Related to Decomposition
Two important questions must be asked repeatedly:

1. Do we need to decompose a relation?


2. What problems (if any) does a given decomposition cause?

To help with the first question, several normal forms have been proposed for relations. If a relation schema is in one of
these normal forms, we know that certain kinds of problems cannot arise.

With respect to the second question, two properties of decompositions are of particular interest. The lossless-join property enables us
to recover any instance of the decomposed relation from corresponding instances of the smaller relations. The
dependency-preservation property enables us to enforce any constraint on the original relation by simply enforcing some constraints on
each of the smaller relations.

Functional Dependencies
A functional dependency (FD) is a kind of IC that generalizes the concept of a key.
Let R be a relation schema and let X and Y be nonempty sets of attributes in R. We say that an instance r of R satisfies the FD X -> Y if
the following holds for every pair of tuples t1 and t2 in r:

If t1.X = t2.X, then t1.Y = t2.Y.

We use the notation t1.X to refer to the projection of tuple t1 onto the attributes in X, in a natural extension of our TRC notation (see
Chapter 4) t.a for referring to attribute a of tuple t. An FD X -> Y essentially says that if two tuples agree on the values in attributes X,
they must also agree on the values in attributes Y.

Figure 15.3 illustrates the meaning of the FD AB -> C by showing an instance that satisfies this dependency. The first two tuples show
that an FD is not the same as a key constraint: although the FD is not violated, AB is clearly not a key for the
relation. The third and fourth tuples illustrate that if two tuples differ in either the A field or the B field, they can differ in the C field
without violating the FD. On the other hand, if we add a tuple <a1, b1, c2, d1> to the instance shown in this figure, the resulting
instance would violate the FD; to see this violation, compare the first tuple in the figure with the new tuple.
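
The definition translates directly into a check over an instance. The Python sketch below is a minimal illustration. The function name `satisfies_fd` is made up, and since the figure itself is not reproduced in these notes, the four tuples are an assumed instance that matches the description above.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for this X-value; any later mismatch violates the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Assumed instance consistent with the description of the figure.
instance = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b1", "C": "c1", "D": "d2"},
    {"A": "a1", "B": "b2", "C": "c2", "D": "d1"},
    {"A": "a2", "B": "b1", "C": "c3", "D": "d1"},
]

holds = satisfies_fd(instance, ["A", "B"], ["C"])            # AB -> C holds
ab_is_key = satisfies_fd(instance, ["A", "B"], ["C", "D"])   # but AB does not determine D
violating = instance + [{"A": "a1", "B": "b1", "C": "c2", "D": "d1"}]
still_holds = satisfies_fd(violating, ["A", "B"], ["C"])
print(holds, ab_is_key, still_holds)
```

A check like this can only show that one instance satisfies or violates an FD; whether the FD holds on the schema is a statement about all legal instances.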

What is Relation?
A relation is a named two-dimensional table of data. Each relation consists of a set of named columns and an arbitrary
number of unnamed rows.
For example, a relation named Employee contains the attributes emp-id, ename, dept name and salary.

What are the Properties of relations?


The properties of relations are defined on two-dimensional tables. They are:
• Each relation (or table) in a database has a unique name.
• An entry at the intersection of each row and column is atomic (single-valued). There can be no multivalued attributes in a
relation.
• Each row is unique; no two rows in a relation are identical.
• Each attribute or column within a table has a unique name.
• The sequence of columns (left to right) is insignificant. The columns of a relation can be interchanged without changing
the meaning or use of the relation.
• The sequence of rows (top to bottom) is insignificant. As with columns, the rows of a relation may be interchanged or
stored in any sequence.

A functional dependency is a constraint between two attributes or two sets of attributes.
For example, the table EMPLOYEE has 4 columns that are functionally dependent on EMP_ID.

Partial functional dependency: a functional dependency in which one or more non-key attributes are functionally
dependent on part of the primary key. Consider the following graphical representation, in which some of the attributes are
partially dependent on the primary key.

In this example, Ename, Dept_name and Salary are fully functionally dependent on the primary key Emp_id, but Course_title
and Date_completed are involved in a partial functional dependency. In this case, the partial functional dependency creates redundancy
in that relation.

What is Normal Form? What are steps in Normal Form?

NORMALIZATION: Normalization is the process of decomposing relations to produce smaller, well-structured relations.
To produce smaller and well-structured relations, the designer works through up to six normal forms.

Steps in Normalization:
A normal form is a state of a relation that results from applying simple rules regarding functional dependencies
(relationships between attributes) to that relation. The normal forms are:
1. First normal form
2. Second normal form
3. Third normal form
4. Boyce/Codd normal form
5. Fourth normal form
6. Fifth normal form

1) First Normal Form: Any multi-valued attributes (also called repeating groups) have been removed.
2) Second Normal Form: Any partial functional dependencies have been removed.
3) Third Normal Form: Any transitive dependencies have been removed.
4) Boyce/Codd Normal Form: Any remaining anomalies that result from functional dependencies have been removed.
5) Fourth Normal Form: Any multi-valued dependencies have been removed.
6) Fifth Normal Form: Any remaining anomalies have been removed.

Advantages of Normalized Relations Over the Un-normalized Relations:


The advantages of normalized relations over un-normalized relations are:
1) A normalized relation (table) does not contain repeating groups, whereas an un-normalized relation (table) contains one or
more repeating groups.
2) A normalized relation has a primary key. There is no primary key present in an un-normalized relation.
3) Normalization removes repeating groups that occur many times in a table.
4) With the help of the normalization process, we can transform an un-normalized table to First Normal Form (1NF) by
removing its repeating groups.
5) Normalized relations (tables) give simpler results, whereas un-normalized relations give more complicated
results.
6) Normalized relations improve storage efficiency, data integrity and scalability, whereas un-normalized relations do not.
7) Normalization results in database consistency and flexible data access.

FIRST NORMAL FORM (1NF):


A relation is in first normal form (1NF) if it contains no multi-valued attributes. Consider the example relation Employee, which contains
multi-valued attributes that are removed and converted into single-valued attributes.
Multi-valued attributes in course title

Removing the multi-valued attributes and converting them to single-valued attributes using 1NF

SECOND NORMAL FORM (2NF):


A relation is in Second Normal Form (2NF) if it is in 1NF and all non-key attributes are fully functionally dependent
on the primary key. In a functional dependency X -> Y, the attribute set on the left-hand side (i.e., X) is the primary key of the
relation and the attributes on the right-hand side (i.e., Y) are the non-key attributes. In some situations, some non-key
attributes are only partially functionally dependent on the primary key. Consider the following example of a partial functional
dependency, which is converted into 2NF by decomposing the relation into two relations.

To avoid this, convert the relation into Second Normal Form. The conversion to 2NF decomposes the relation into two relations, shown in
the graphical representation.
In the above graphical representation
• the EMPLOYEE relation satisfies the rule of 1NF, and
• the COURSE relation satisfies the rule of 2NF after the decomposition into two relations.

THIRD NORMAL FORM (3NF): A relation is in 3NF if it is in Second Normal Form and has no transitive dependencies.

Transitive dependency: A transitive dependency is a functional dependency between two non-key attributes. For example, consider
the relation SALES with attributes cust_id, name, salesperson and region, shown below.

CUST_ID   NAME      SALESPERSON   REGION
1001      Anand     Smith         South
1002      Sunil     Kiran         West
1003      Govind    Babu Rao      East
1004      Manohar   Smith         South
1005      Madhu     Somu          North

In this example, inserting, deleting and updating rows all lead to anomalies:

a) Insertion Anomaly: A new salesperson assigned to the North region cannot be entered until a customer has been assigned to
that salesperson. This causes an insertion anomaly.

b) Deletion Anomaly: If customer number 1003 is deleted from the table, we lose the information about the salesperson
who is assigned to that customer. This causes a deletion anomaly.

c) Modification Anomaly: If salesperson Smith is reassigned to the East region, several rows must be changed to reflect
that fact. This causes an update anomaly.

To avoid these anomalies, the transitive dependency can be removed by decomposing SALES into two
relations in 3NF.

Consider the following example, which removes the anomalies by decomposing SALES into two relations.

CUST_ID   NAME      SALESPERSON
1001      Anand     Smith
1002      Sunil     Kiran
1003      Govind    Babu Rao
1004      Manohar   Smith
1005      Madhu     Somu

SALESPERSON   REGION
Smith         South
Kiran         West
Babu Rao      East
Somu          North
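
The decomposition of SALES can be reproduced with a few set comprehensions. The Python sketch below uses the rows from the tables above. The variable names are made up for the example, and the natural join is written out by hand.

```python
# SALES rows from the example: (cust_id, name, salesperson, region).
sales = [
    (1001, "Anand", "Smith", "South"),
    (1002, "Sunil", "Kiran", "West"),
    (1003, "Govind", "Babu Rao", "East"),
    (1004, "Manohar", "Smith", "South"),
    (1005, "Madhu", "Somu", "North"),
]

# Project onto the two 3NF schemas; as sets, each salesperson's region is stored once.
customer = {(c, n, s) for c, n, s, _ in sales}
salesperson = {(s, r) for _, _, s, r in sales}

# Natural join on salesperson rebuilds exactly the original relation.
rejoined = sorted((c, n, s, r) for (c, n, s) in customer
                  for (s2, r) in salesperson if s == s2)

# Reassigning Smith to East: rows to change before and after the decomposition.
rows_to_change_before = sum(1 for row in sales if row[2] == "Smith")
rows_to_change_after = sum(1 for row in salesperson if row[0] == "Smith")
print(len(salesperson), rows_to_change_before, rows_to_change_after)
```

After the decomposition, a salesperson's region can also be recorded before any customer is assigned, and deleting customer 1003 no longer loses the fact that Babu Rao covers the East region.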
BOYCE/CODD NORMAL FORM (BCNF): A relation is in BCNF if it is in 3NF and every determinant is a candidate
key.

Formally, let F be the set of FDs given for a relation schema R. R is in BCNF if, for every FD X -> A in F+ where X ⊆ R and A ∈ R,
either A ∈ X (the FD is trivial) or X is a super key of R.

Boyce-Codd normal form removes the anomalies that remain in a 3NF relation as a result of functional dependencies.
For example, consider the relation STUDENT-ADVISOR, which is in 3NF.

In the above relation the primary key is the combination of student-id and major-subject. Here major-subject, which is part of the
primary key, is dependent upon the non-key attribute faculty-advisor. So the determinant here is faculty-advisor, but it is not a
candidate key.
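
Whether a determinant is a super key can be tested mechanically with attribute closure. The Python sketch below is a minimal illustration. The function names `closure` and `bcnf_violations` are made up, and the two FDs are the ones described for STUDENT-ADVISOR: the primary key determines faculty_advisor, and faculty_advisor determines major_subject.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose left side is not a super key of the schema."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(schema)]

schema = ("student_id", "major_subject", "faculty_advisor")
fds = [
    (("student_id", "major_subject"), ("faculty_advisor",)),
    (("faculty_advisor",), ("major_subject",)),
]

advisor_closure = closure(("faculty_advisor",), fds)
violations = bcnf_violations(schema, fds)

# After decomposition, the advisor relation has faculty_advisor as its key.
advisor_ok = bcnf_violations(("faculty_advisor", "major_subject"),
                             [(("faculty_advisor",), ("major_subject",))])
print(sorted(advisor_closure), violations, advisor_ok)
```

The closure of faculty_advisor does not contain student_id, so faculty_advisor is a determinant that is not a super key, which is exactly the BCNF violation described above.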

In this example there are no partial dependencies and no transitive dependencies. There is only a functional
dependency between part of the primary key and a non-key attribute. Because of this dependency there is an anomaly in this
relation. Suppose that for the maths subject the advisor B is replaced by X. This change must be made in two or more rows in
this relation. This is an update anomaly.

To convert a relation to BCNF, the first step is to modify the original relation so that the determinant (a non-key attribute)
becomes a component of the primary key of the new relation. The attribute that is dependent on the determinant becomes a
non-key attribute.
The second step in the conversion process is to decompose the relation to eliminate the partial functional dependency.

This results in two relations. These relations are in 3NF and in BCNF, since in each of them every determinant is a candidate key.

Two relations are in BCNF.

In these two relations the student relation has a composite key, which contains the attributes student-id and faculty-advisor.
Here faculty-advisor is a foreign key that references the primary key of the advisor relation.

Two relations are in BCNF with sample data

Fourth Normal Form (4 NF):

A relation is in Fourth Normal Form if it is in BCNF and contains no nontrivial multivalued dependencies. Let
R be a relation schema, X and Y be nonempty subsets of the attributes of R, and F be a set of dependencies that includes both FDs and MVDs (i.e.,
functional dependencies and multi-valued dependencies). Then R is said to be in Fourth Normal Form (4NF) if for every
MVD X ->-> Y that holds over R, one of the following statements is true:

1) Y ⊆ X or XY = R, or 2) X is a super key.

Example: Consider a relation schema ABCD and suppose that the FD A -> BCD and the MVD B ->-> C are given, as
shown in the table.
It shows three tuples from an instance of ABCD that satisfies the given MVD B ->-> C. From the definition of an MVD, given
tuples t1 and t2, it follows that tuple t3 must also be included in the instance. Now, consider tuples t2 and t3. From
the given FD A -> BCD and the fact that these tuples have the same A-value, we can deduce that c1 = c2.

Therefore, we see that the FD B -> C must hold over ABCD whenever the FD A -> BCD and the MVD
B ->-> C hold. If B -> C holds, the relation is not in BCNF (unless additional FDs make B a key), and hence it is not in 4NF either.

The fourth normal form is useful because it overcomes the problems of the various approaches that represent
multi-valued attributes in a single relation.

Fifth Normal Form (5NF): Any remaining anomalies from a 4NF relation have been removed.

A relation schema R is said to be in Fifth Normal Form (5NF) if, for every join dependency
* (R1, ..., Rn) that holds over R, one of the following statements is true:

• Ri = R for some i, or

• the JD is implied by the set of those FDs over R in which the left side is a key for R.

5NF is concerned with the lossless-join property of decompositions.

LOSSLESS-JOIN DECOMPOSITION:

Let R be a relation schema and let F be a set of FDs (functional dependencies) over R. A decomposition of R into two schemas
with attribute sets X and Y is said to be a lossless-join decomposition with respect to F if, for every instance r of R that satisfies
the dependencies in F, πX(r) ⋈ πY(r) = r.

In simple words, we can recover the original relation from the decomposed relations.

In general, if we take projections of a relation and recombine them using natural join, we get back every tuple of the original
relation, but we may also obtain some additional tuples that were not in the original relation.
The decomposition of an instance r of schema SPD into SP, i.e., the projection πSP(r), and PD, i.e., the projection πPD(r),
is therefore a lossy decomposition when the join returns all original tuples of r together with some additional tuples
that were not in the original relation r.
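
The difference between a lossy and a lossless decomposition can be observed on a single instance by projecting and re-joining. The Python sketch below is a minimal illustration. The helper names and the three SPD tuples are assumptions made for the example, and checking one instance only illustrates the property (the definition quantifies over all legal instances).

```python
def project(rows, attrs, keep):
    """Projection of rows (tuples over attrs) onto the attributes in keep."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two relations; returns the joined rows and their attribute list."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    joined = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in common):
                joined.add(t1 + tuple(t2[attrs2.index(a)] for a in extra))
    return joined, attrs1 + extra

def rejoin(rows, attrs, x, y):
    """Project rows onto x and y, join the projections, and restore the column order."""
    joined, joined_attrs = natural_join(project(rows, attrs, x), x,
                                        project(rows, attrs, y), y)
    order = [joined_attrs.index(a) for a in attrs]
    return {tuple(t[i] for i in order) for t in joined}

attrs = ["S", "P", "D"]
# Hypothetical instance r of SPD.
r = {("s1", "p1", "d1"), ("s2", "p2", "d2"), ("s3", "p1", "d3")}

via_sp_pd = rejoin(r, attrs, ["S", "P"], ["P", "D"])   # joins on P
via_sp_sd = rejoin(r, attrs, ["S", "P"], ["S", "D"])   # joins on S
spurious = via_sp_pd - r
print(sorted(spurious), via_sp_sd == r)
```

Joining on P produces spurious tuples because p1 appears with two different S and D values, while joining on S, which is unique in this instance, returns exactly r.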
UNION Operation: UNION is used to combine the results of two or more SELECT statements.
It eliminates duplicate rows from its result set. For UNION, the number of columns
and their data types must be the same in all the SELECT statements being combined.

Example of UNION
The First table:
ID   NAME
1    abhi
2    adam

The Second table:
ID   NAME
2    adam
3    Chester

Union SQL query will be,

SELECT * FROM First UNION SELECT * FROM Second;
The result set table will look like,
ID   NAME
1    abhi
2    adam
3    Chester

UNION ALL: This operation is similar to UNION, but it also keeps the duplicate rows.

Example of Union All
The First table:
ID   NAME
1    abhi
2    adam

The Second table:
ID   NAME
2    adam
3    Chester

Union All query will be like,
SELECT * FROM First UNION ALL SELECT * FROM Second;
The result set table will look like,
ID   NAME
1    abhi
2    adam
2    adam
3    Chester
INTERSECT:
The INTERSECT operation is used to combine two SELECT statements, but it returns only the records that
are common to both SELECT statements. For INTERSECT, the number of columns and their
data types must be the same.
NOTE: MySQL supports the INTERSECT operator only from version 8.0.31 onward; earlier versions do not support it.

Example of Intersect
The First table:
ID   NAME
1    abhi
2    adam

The Second table:
ID   NAME
2    adam
3    Chester

Intersect query will be,
SELECT * FROM First INTERSECT SELECT * FROM Second;
The result set table will look like,
ID   NAME
2    adam

MINUS:
The MINUS operation combines the results of two SELECT statements and returns only those rows that
belong to the first result set but not to the second. MINUS is the Oracle keyword; standard SQL calls this operation EXCEPT.

Example of Minus
The First table:
ID   NAME
1    abhi
2    adam

The Second table:
ID   NAME
2    adam
3    Chester

Minus query will be,
SELECT * FROM First MINUS SELECT * FROM Second;
The result set table will look like,
ID   NAME
1    abhi
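
All four operations can be compared side by side on the First and Second tables. The sketch below uses Python's sqlite3 module. SQLite spells the difference operator EXCEPT rather than MINUS, and the table names are double-quoted here as a precaution because FIRST is a keyword in some SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "First"  (ID INTEGER, NAME TEXT);
CREATE TABLE "Second" (ID INTEGER, NAME TEXT);
INSERT INTO "First"  VALUES (1,'abhi'), (2,'adam');
INSERT INTO "Second" VALUES (2,'adam'), (3,'Chester');
""")

def combine(op):
    """Run First <op> Second and return the rows sorted by ID."""
    sql = f'SELECT * FROM "First" {op} SELECT * FROM "Second"'
    return sorted(conn.execute(sql).fetchall())

union = combine("UNION")            # duplicates removed
union_all = combine("UNION ALL")    # duplicates kept
intersect = combine("INTERSECT")    # rows common to both tables
minus = combine("EXCEPT")           # rows of First that are not in Second
print(union, union_all, intersect, minus)
```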

NESTED QUERIES
A subquery (also called an inner query or a nested query) is a query embedded within another
SQL query, most often in its WHERE clause.
A subquery returns data that the main query uses as a condition to further
restrict the data to be retrieved.

Subqueries can be used with the SELECT, INSERT, UPDATE, and DELETE statements along
with operators such as =, <, >, >=, <=, IN, and BETWEEN.

Subqueries with the SELECT Statement

Subqueries are most frequently used with the SELECT statement. The basic syntax is as follows −

SELECT column_name [, column_name ]
FROM table1 [, table2 ]
WHERE column_name OPERATOR
   (SELECT column_name [, column_name ]
    FROM table1 [, table2 ]
    [WHERE condition])

Find the names of sailors who have reserved boat 103.


SELECT S.sname
FROM Sailors S
WHERE S.sid IN(SELECT R.sid FROM Reserves R WHERE R.bid = 103 )

Find the names of sailors who have reserved a red boat.


SELECT S.sname FROM Sailors S
WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid IN (SELECT B.bid
FROM Boats B WHERE B.color = 'red'))

Find the names of sailors who have reserved a red and a green boat.
SELECT S.sname
FROM Sailors S, Reserves R1, Boats B1, Reserves R2, Boats B2
WHERE S.sid = R1.sid AND R1.bid = B1.bid AND S.sid = R2.sid
AND R2.bid = B2.bid AND B1.color = 'red' AND B2.color = 'green'
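
To see the nested queries at work, the sketch below loads a few rows into Sailors, Boats, and Reserves and runs the "boat 103" and "red boat" queries. It assumes Python's built-in sqlite3 module; the sample rows and the column types are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5),
                               (58, 'Rusty', 10, 35.0);
    INSERT INTO Boats VALUES (102, 'Interlake', 'red'), (103, 'Clipper', 'green');
    INSERT INTO Reserves VALUES (22, 102, '1998-10-10'), (22, 103, '1998-10-08'),
                                (31, 102, '1998-11-10');
""")

# The inner query yields the sids that reserved boat 103; the outer query
# keeps the sailors whose sid is in that set.
boat_103 = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
""").fetchall()

# Two levels of nesting: red boats -> reservations of those boats -> sailors.
red_boat = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid IN (SELECT R.sid FROM Reserves R
                    WHERE R.bid IN (SELECT B.bid FROM Boats B
                                    WHERE B.color = 'red'))
    ORDER BY S.sname
""").fetchall()

print(boat_103)
print(red_boat)
```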

Aggregate functions :
An aggregate function takes the values from multiple rows of a table and returns a single
value for them.
Aggregate functions are used in the SELECT list (or in the HAVING clause) of a query.
Syntax :
SELECT <FUNCTION NAME> (<PARAMETER>) FROM <TABLE NAME>;

AVG Function
This function returns the average value of the numeric column that is supplied as a parameter.
Example: Write a query to select average salary from employee table.
Select AVG(salary) from Employee;

COUNT Function
The COUNT function returns the number of rows in the result. Whether NULL values are counted
depends on the form of COUNT that is used (see Types below).
Example: Write a query to return number of rows where salary > 20000.
Select COUNT(*) from Employee where Salary > 20000;
Types −
• COUNT(*): counts all the rows of the table, including rows that contain NULL values.
• COUNT(COLUMN_NAME): counts the number of non-null values in the column.
• COUNT(DISTINCT COLUMN_NAME): counts the number of distinct non-null values in the column.

MAX Function
The MAX function is used to find maximum value in the column that is supplied as a parameter. It
can be used on any type of data.
Example − Write a query to find the maximum salary in employee table.
Select MAX(salary) from Employee

SUM Function
This function sums up the values in the column supplied as a parameter.
Example: Write a query to get the total salary of employees.
Select SUM(salary) from Employee
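
The behaviour of the aggregate functions on NULL values is easiest to see on a tiny table. The following is a sketch using Python's built-in sqlite3 module; the Employee rows are hypothetical, with one NULL salary and one repeated salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (id INTEGER, salary INTEGER);
    -- Hypothetical rows: one salary is NULL and one value repeats.
    INSERT INTO Employee VALUES (1, 30000), (2, 30000), (3, 15000), (4, NULL);
""")

# COUNT(*) counts rows; the other aggregates skip the NULL salary.
(count_all, count_salary, count_distinct,
 avg_salary, max_salary, sum_salary) = conn.execute("""
    SELECT COUNT(*), COUNT(salary), COUNT(DISTINCT salary),
           AVG(salary), MAX(salary), SUM(salary)
    FROM Employee
""").fetchone()

print(count_all, count_salary, count_distinct)
print(avg_salary, max_salary, sum_salary)
```

Note that AVG divides by the three non-null salaries, not by the four rows.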

Find the average age of all sailors.
SELECT AVG(S.age) FROM Sailors S
Find the name and age of the oldest sailor.
SELECT S.sname, S.age FROM Sailors S WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
(The shorter form SELECT S.sname, MAX(S.age) FROM Sailors S is not legal SQL: a plain column
cannot appear next to an aggregate in the SELECT list unless the query groups by that column.)
Find the names of sailors who are older than the oldest sailor with a rating of 10.
SELECT S.sname FROM Sailors S WHERE S.age > (SELECT MAX(S2.age) FROM Sailors S2
WHERE S2.rating = 10)
Find the age of the youngest sailor for each rating level.
SELECT S.rating, MIN(S.age) FROM Sailors S GROUP BY S.rating
Find the sailors with the highest rating.
SELECT S.sid FROM Sailors S WHERE S.rating >= ALL (SELECT S2.rating FROM Sailors S2)
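
As a rough check of the Sailors queries, the sketch below runs them with Python's built-in sqlite3 module on a few hypothetical rows. SQLite has no ALL operator, so the "highest rating" query is written here with MAX in a subquery; that substitution is an assumption of this sketch, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5),
                               (58, 'Rusty', 10, 35.0), (71, 'Zorba', 10, 16.0);
""")

# Name and age of the oldest sailor: compare each age with the overall maximum.
oldest = conn.execute("""
    SELECT S.sname, S.age FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
""").fetchall()

# Sailors older than the oldest sailor with a rating of 10.
older_than_best = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.age > (SELECT MAX(S2.age) FROM Sailors S2 WHERE S2.rating = 10)
    ORDER BY S.sname
""").fetchall()

# Age of the youngest sailor for each rating level.
youngest_per_rating = conn.execute("""
    SELECT S.rating, MIN(S.age) FROM Sailors S
    GROUP BY S.rating ORDER BY S.rating
""").fetchall()

# Sailors with the highest rating (MAX used in place of >= ALL).
top_rated = conn.execute("""
    SELECT S.sid FROM Sailors S
    WHERE S.rating = (SELECT MAX(S2.rating) FROM Sailors S2)
    ORDER BY S.sid
""").fetchall()

print(oldest, older_than_best, youngest_per_rating, top_rated)
```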
GROUP BY :
The SQL GROUP BY clause is used with the SELECT statement to arrange rows that have
identical values in the grouping columns into groups, usually so that an aggregate
function can be applied to each group.

Syntax
The basic syntax of a GROUP BY clause is shown in the following code block. The
GROUP BY clause must follow the conditions in the WHERE clause and must precede
the ORDER BY clause if one is used.
SELECT column1, column2
FROM table_name
WHERE [ conditions ]
GROUP BY column1, column2
ORDER BY column1, column2

The SQL HAVING Clause

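The HAVING clause was added to SQL because the WHERE keyword cannot be used
with aggregate functions. WHERE filters individual rows before they are grouped;
HAVING filters the groups produced by GROUP BY.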

HAVING Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);
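
The order in which the clauses act (WHERE on rows, GROUP BY, then HAVING on groups) can be illustrated with a small run. This sketch assumes Python's built-in sqlite3 module and hypothetical Sailors rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO Sailors VALUES
        (22, 'Dustin', 7, 45.0), (29, 'Brutus', 7, 35.0), (31, 'Lubber', 8, 55.5),
        (58, 'Rusty', 10, 35.0), (71, 'Zorba', 10, 16.0), (74, 'Horatio', 10, 25.0);
""")

# WHERE drops sailors under 18 first; the remaining rows are grouped by
# rating; HAVING then keeps only the groups with more than one sailor.
groups = conn.execute("""
    SELECT S.rating, COUNT(*), AVG(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
""").fetchall()

print(groups)
```

Rating 8 is dropped by HAVING because only one sailor has it, and the 16-year-old is excluded by WHERE before the rating 10 average is computed.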
NULL VALUES: A field with a NULL value is a field with no value.
If a field in a table is optional, it is possible to insert a new record or update a record without adding a
value to this field. Then, the field will be saved with a NULL value.
NULL cannot be tested with comparison operators such as = or <>; the IS NULL and IS NOT NULL
operators are used instead.
IS NULL syntax:
SELECT column_names
FROM table_name
WHERE column_name IS NULL;
Example:
SELECT CustomerName, ContactName, Address
FROM Customers
WHERE Address IS NULL;
IS NOT NULL syntax:
SELECT column_names
FROM table_name
WHERE column_name IS NOT NULL;
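
A short experiment shows why IS NULL is needed: a comparison with NULL using = is never true, so it selects no rows. The sketch assumes Python's built-in sqlite3 module, and the two Customers rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerName TEXT, Address TEXT);
    -- Hypothetical rows: the second customer has no address.
    INSERT INTO Customers VALUES ('Alfreds', 'Obere Str. 57'), ('Bolido', NULL);
""")

def names(condition):
    """Return the customer names that satisfy the given WHERE condition."""
    sql = f"SELECT CustomerName FROM Customers WHERE {condition}"
    return conn.execute(sql).fetchall()

with_equals = names("Address = NULL")     # comparison with NULL is never true
with_is_null = names("Address IS NULL")
with_is_not_null = names("Address IS NOT NULL")

print(with_equals, with_is_null, with_is_not_null)
```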

Complex Integrity Constraints in SQL :


Constraints over a Single Table :
We can specify complex constraints over a single table using table constraints, which have
the form CHECK conditional-expression. For example, to ensure that rating must be an integer in
the range 1 to 10, we can use :
Create TABLE Sailors (sid INTEGER, sname CHAR(10), rating INTEGER, age INTEGER, PRIMARY
KEY (sid), CHECK (rating >= 1 AND rating <=10));
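
The effect of the CHECK constraint can be observed by trying to insert a row that violates it. This is a minimal sketch assuming Python's built-in sqlite3 module, which rejects the row with an IntegrityError; the inserted rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sailors (sid INTEGER, sname CHAR(10), rating INTEGER, age INTEGER,
                          PRIMARY KEY (sid),
                          CHECK (rating >= 1 AND rating <= 10))
""")

# A rating inside the range 1..10 is accepted.
conn.execute("INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45)")

# A rating outside the range violates the CHECK constraint and is rejected.
try:
    conn.execute("INSERT INTO Sailors VALUES (99, 'Bad', 11, 30)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Sailors").fetchone()[0]
print(rejected, row_count)
```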

To enforce the constraint that Interlake boats cannot be reserved, we could use,
Create TABLE Reserves (sid INTEGER, bid INTEGER, day DATE, FOREIGN KEY (sid) REFERENCES
Sailors, FOREIGN KEY (bid) REFERENCES Boats, CONSTRAINT noInterLakeRes CHECK ('Interlake' <>
(SELECT B.bname FROM Boats B WHERE B.bid = Reserves.bid)));

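Domain Constraints
A domain is a named type with its own default value and CHECK condition; a column declared
with the domain can hold only values that satisfy the condition.
CREATE DOMAIN ratingval INTEGER DEFAULT 1 CHECK (VALUE >= 1 AND VALUE <= 10)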

Triggers:
A trigger is a stored procedure in a database that is automatically invoked
whenever a specified event in the database occurs. For example, a trigger can
be invoked when a row is inserted into a specified table or when certain
table columns are updated.
You can write triggers that fire whenever one of the following operations occurs:
– DML statements (INSERT, UPDATE, DELETE) on a particular table or view, issued by any
user
– DDL statements (CREATE or ALTER primarily) issued either by a particular schema/user
or by any schema/user in the database
– Database events, such as logon/logoff, errors, or startup/shutdown, also issued either
by a particular schema/user or by any schema/user in the database
You can also use triggers to:
– Automatically generate derived column values
– Prevent invalid transactions
– Enforce complex security authorizations
– Enforce referential integrity across nodes in a distributed database
– Enforce complex business rules
– Provide transparent event logging
– Provide auditing
– Maintain synchronous table replicates
– Gather statistics on table access
– Modify table data when DML statements are issued against views
– Publish information about database events, user events, and SQL statements to
subscribing applications

Parts of a Trigger :
A trigger has three basic parts:
• A triggering event or statement
• A trigger restriction
• A trigger action
Syntax:
create trigger [trigger_name]
[before | after]
{insert | update | delete} on [table_name]
[for each row]
[trigger_body]

1. create trigger [trigger_name]: Creates a trigger with the given trigger_name.
2. [before | after]: This specifies when the trigger will be executed.
3. {insert | update | delete}: This specifies the DML operation.
4. on [table_name]: This specifies the name of the table associated with the
trigger.
5. [for each row]: This specifies a row-level trigger, i.e., the trigger will be
executed for each row being affected.
6. [trigger_body]: This provides the operation to be performed when the trigger is fired.
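
The three parts of a trigger (event, restriction, action) can be seen in a small working example. This is a sketch in SQLite's trigger dialect, run through Python's built-in sqlite3 module; the table RatingLog and the trigger name log_rating_change are made up for the illustration, and other systems use slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors   (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER);
    CREATE TABLE RatingLog (sid INTEGER, old_rating INTEGER, new_rating INTEGER);

    -- Event: an UPDATE of rating on Sailors.
    -- Restriction: the WHEN clause.
    -- Action: the INSERT between BEGIN and END.
    CREATE TRIGGER log_rating_change
    AFTER UPDATE OF rating ON Sailors
    FOR EACH ROW
    WHEN OLD.rating <> NEW.rating
    BEGIN
        INSERT INTO RatingLog VALUES (OLD.sid, OLD.rating, NEW.rating);
    END;

    INSERT INTO Sailors VALUES (22, 'Dustin', 7), (31, 'Lubber', 8);
    UPDATE Sailors SET rating = 9 WHERE sid = 22;  -- rating changes: action runs
    UPDATE Sailors SET rating = 8 WHERE sid = 31;  -- same value: WHEN is false
""")

log = conn.execute("SELECT * FROM RatingLog").fetchall()
print(log)
```

Only the first UPDATE is logged, because the restriction filters out the update that leaves the rating unchanged.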

Remove Trigger:

• The DROP TRIGGER statement deletes a trigger from the database.
• Here is the basic syntax of the DROP TRIGGER statement:
DROP TRIGGER [IF EXISTS] [schema_name.]trigger_name;

Active Databases
An active database is a database that includes a set of triggers. Such databases
can be difficult to maintain because of the complexity that arises in
understanding the combined effect of these triggers. In such a database, before
executing a statement that modifies the database, the DBMS first checks whether
any trigger is activated by that statement.
If a trigger is activated, the DBMS evaluates its condition part and then executes
the action part only if the condition evaluates to true. A single statement can
activate more than one trigger.
Features of Active Database:
1. It possesses all the concepts of a conventional database, i.e. data modelling
facilities, query language etc.
2. It supports all the functions of a traditional database like data definition, data
manipulation, storage management etc.
3. It supports definition and management of ECA (Event-Condition-Action) rules.
4. It detects event occurrence.
5. It must be able to evaluate conditions and to execute actions.
6. This means that it has to implement rule execution.
Advantages :
1. Enhances traditional database functionalities with powerful rule processing
capabilities.
2. Enables a uniform and centralized description of the business rules relevant
to the information system.
3. Avoids redundancy of checking and repair operations.
4. Suitable platform for building large and efficient knowledge base and expert
systems.
SCHEMA REFINEMENT :

Schema Refinement is a technique of organizing the data in the database. It is a systematic
approach of decomposing tables to eliminate data redundancy and undesirable characteristics
like Insertion, Update and Deletion Anomalies.

REDUNDANCY :
Redundancy takes place when there are multiple copies of the same data in a database.
Simply put, storing the same or a similar value more than once is referred to as redundancy.
This problem arises when a database is not normalized.
Suppose a table of student details has the attributes: student Id, student name, college name,
college rank, course opted.

In such a table the values of the attributes college name, college rank, and course are repeated,
which can lead to problems. Problems caused by redundancy are: Insertion anomaly, Deletion
anomaly, and Updation anomaly.
1. Insertion Anomaly –
If the details of a student whose course has not been decided yet have to be inserted, then the
insertion will not be possible until the course is decided for the student.

This problem happens when the insertion of a data record is not possible without adding some
additional unrelated data to the record.
2. Deletion Anomaly –
If the details of the students in this table are deleted, then the details of the college will also
get deleted, which should not happen.
This anomaly happens when deletion of a data record results in losing some unrelated
information that was stored as part of the record that was deleted from the table.
3. Updation Anomaly –
If the rank of the college changes, then the change has to be made in every row that stores it,
which is time-consuming and computationally costly.

If the update does not occur at all places, then the database will be in an inconsistent state.
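
The update and deletion anomalies can be reproduced on a small unnormalized table. The sketch below assumes Python's built-in sqlite3 module; the column names follow the attributes listed above, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (student_id INTEGER, student_name TEXT,
                          college_name TEXT, college_rank INTEGER, course TEXT);
    -- Hypothetical rows: the rank of 'ABC College' is stored twice.
    INSERT INTO Student VALUES (1, 'Asha',  'ABC College', 5, 'CSE'),
                               (2, 'Ravi',  'ABC College', 5, 'ECE'),
                               (3, 'Meena', 'XYZ College', 9, 'CSE');
    -- Update anomaly: the rank is changed in only one of the two copies.
    UPDATE Student SET college_rank = 4 WHERE student_id = 1;
""")

# The same college now has two different ranks: an inconsistent state.
abc_ranks = conn.execute("""
    SELECT DISTINCT college_rank FROM Student
    WHERE college_name = 'ABC College' ORDER BY college_rank
""").fetchall()

# Deletion anomaly: removing the only student of 'XYZ College' also
# removes everything the table knew about that college.
conn.execute("DELETE FROM Student WHERE student_id = 3")
xyz_rows = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE college_name = 'XYZ College'"
).fetchone()[0]

print(abc_ranks, xyz_rows)
```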

Normalization :
Normalization is the process of minimizing redundancy in a relation or set of
relations. Redundancy in a relation may cause insertion, deletion, and update anomalies,
so reducing it helps to avoid them. Normal forms are used to
eliminate or reduce redundancy in database tables.
Normal Form   Description
1NF           A relation is in 1NF if every attribute contains only atomic values.
2NF           A relation will be in 2NF if it is in 1NF and all non-key attributes are fully
              functionally dependent on the primary key.
3NF           A relation will be in 3NF if it is in 2NF and no transitive dependency exists.
BCNF          A stronger definition of 3NF is known as Boyce Codd's normal form.
4NF           A relation will be in 4NF if it is in Boyce Codd's normal form and has no
              multi-valued dependency.
5NF           A relation is in 5NF if it is in 4NF and does not contain any join dependency;
              joining should be lossless.

Normalization
If a database design is not sound, it may contain anomalies. A database with anomalies is hard
to manage, because every insert, update, and delete has to guard against inconsistent data.
Normalization is a method to remove these anomalies and bring the database to a consistent
state.

First Normal Form


First Normal Form is defined in the definition of relations (tables) itself. This rule defines that all
the attributes in a relation must have atomic domains. The values in an atomic domain are
indivisible units.

We re-arrange the relation (table) as below, to convert it to First Normal Form.

Each attribute must contain only a single value from its pre-defined domain.

Second Normal Form


Before we learn about the second normal form, we need to understand the following −
• Prime attribute − An attribute that is part of some candidate key is known as a prime
attribute.
• Non-prime attribute − An attribute that is not part of any candidate key is said to be a
non-prime attribute.
If we follow second normal form, then every non-prime attribute should be fully functionally
dependent on the key. That is, if X → A holds for a key X and a non-prime attribute A, then there
should not be any proper subset Y of X for which Y → A also holds true.
We see here in the Student_Project relation that the key attributes are Stu_ID and Proj_ID.
According to the rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must be dependent upon
both together and not on either key attribute individually. But we find that Stu_Name can be
identified by Stu_ID alone and Proj_Name can be identified by Proj_ID alone. This is
called partial dependency, which is not allowed in Second Normal Form.

We break the relation in two as depicted in the above picture, so that no partial dependency remains.

Third Normal Form


For a relation to be in Third Normal Form, it must be in Second Normal Form and the following
must be satisfied −
• No non-prime attribute is transitively dependent on a key.
• For any non-trivial functional dependency X → A, either −
  o X is a superkey, or
  o A is a prime attribute.
We find that in the above Student_detail relation, Stu_ID is the key and the only prime attribute.
We find that City can be identified by Stu_ID as well as by Zip itself. Zip is not a superkey, and
City is not a prime attribute. Additionally, Stu_ID → Zip → City, so there exists a transitive
dependency.
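
Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below uses plain Python; the helper fd_holds and the sample Student_detail rows are hypothetical. Keep in mind that a check on sample rows can only refute a dependency; that a dependency holds for the schema in general is a design decision.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows of the Student_detail relation.
student_detail = [
    {"Stu_ID": 1, "Stu_Name": "Asha",  "Zip": "500001", "City": "Hyderabad"},
    {"Stu_ID": 2, "Stu_Name": "Ravi",  "Zip": "500001", "City": "Hyderabad"},
    {"Stu_ID": 3, "Stu_Name": "Meena", "Zip": "600001", "City": "Chennai"},
]

key_determines_all = fd_holds(student_detail, ["Stu_ID"], ["Stu_Name", "Zip", "City"])
zip_determines_city = fd_holds(student_detail, ["Zip"], ["City"])
zip_is_key = fd_holds(student_detail, ["Zip"], ["Stu_ID"])

print(key_determines_all, zip_determines_city, zip_is_key)
```

Zip → City holds while Zip does not determine Stu_ID, which is the situation that violates Third Normal Form.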
To bring this relation into third normal form, we break the relation into two relations as follows −

Boyce-Codd Normal Form


Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form. BCNF
states that −
• For any non-trivial functional dependency X → A, X must be a super-key.
In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the
super-key in the relation ZipCodes. So the dependencies
Stu_ID → Stu_Name, Zip
and
Zip → City
each have a super-key on the left-hand side, which confirms that both the relations are in BCNF.
Fourth Normal Form

o A relation will be in 4NF if it is in Boyce Codd normal form and has no multi-valued dependency.
o A multi-valued dependency A →→ B holds if a single value of A is associated with a set of
values of B, and that set does not depend on the other attributes of the relation.

Example
STUDENT
STU_ID   COURSE      HOBBY
21       Computer    Dancing
21       Math        Singing
34       Chemistry   Dancing
74       Biology     Cricket
59       Physics     Hockey

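The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent attributes:
there is no relationship between a student's courses and a student's hobbies. STU_ID therefore
multi-determines both of them (STU_ID →→ COURSE and STU_ID →→ HOBBY), and a student with
several courses and several hobbies, such as STU_ID 21, causes data to be repeated.
To bring the table into 4NF, it is decomposed into two tables: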

STUDENT_COURSE
STU_ID   COURSE
21       Computer
21       Math
34       Chemistry
74       Biology
59       Physics

STUDENT_HOBBY
STU_ID   HOBBY
21       Dancing
21       Singing
34       Dancing
74       Cricket
59       Hockey
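
What "independent" means for COURSE and HOBBY can be made concrete by projecting STUDENT onto the two smaller tables and joining them back on STU_ID. This is a sketch in plain Python using the rows of the example; the variable names are chosen for the illustration.

```python
# Rows of the STUDENT table from the example: (STU_ID, COURSE, HOBBY).
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]

# Project onto the two 4NF tables.
student_course = {(s, c) for s, c, _ in student}
student_hobby = {(s, h) for s, _, h in student}

# Natural join on STU_ID: every course of a student meets every hobby.
rejoined = sorted(
    (s, c, h)
    for s, c in student_course
    for s2, h in student_hobby
    if s == s2
)

for row in rejoined:
    print(row)
```

For student 21 the join produces all four course/hobby combinations, because the two facts are stored independently. Keeping them in one table would require writing out every such combination, which is the repetition that 4NF removes.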

Fifth Normal Form / Project-Join Normal Form (5NF):
A relation R is in 5NF if and only if every join dependency in R is implied by the
candidate keys of R. A relation decomposed into smaller relations must have the lossless-join
property, which ensures that no spurious or extra tuples are generated when the relations
are reunited through a natural join.
Properties – A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should already be in 4NF.
2. It cannot be decomposed any further without loss (it has no remaining join dependency).
Table – ACP
Agent   Company   Product
A1      PQR       Nut
A1      PQR       Bolt
A1      XYZ       Nut
A1      XYZ       Bolt
A2      PQR       Nut

The relation ACP is decomposed into the 3 relations R1, R2, and R3 listed next. The natural
join of all three relations gives back ACP.

Table – R1
Agent   Company
A1      PQR
A1      XYZ
A2      PQR

Table – R2
Agent   Product
A1      Nut
A1      Bolt
A2      Nut

Table – R3
Company   Product
PQR       Nut
PQR       Bolt
XYZ       Nut
XYZ       Bolt
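
That the three projections join back to ACP without spurious tuples can be verified directly. The following is a sketch in plain Python that uses sets of tuples for the relations; the variable names are chosen for the illustration.

```python
# Rows of ACP: (Agent, Company, Product).
acp = {
    ("A1", "PQR", "Nut"), ("A1", "PQR", "Bolt"),
    ("A1", "XYZ", "Nut"), ("A1", "XYZ", "Bolt"),
    ("A2", "PQR", "Nut"),
}

# The three projections R1, R2, R3.
r1 = {(a, c) for a, c, p in acp}   # Agent, Company
r2 = {(a, p) for a, c, p in acp}   # Agent, Product
r3 = {(c, p) for a, c, p in acp}   # Company, Product

# Natural join of R1, R2 and R3: match on Agent, then keep only the
# (Company, Product) pairs that also occur in R3.
joined = {
    (a, c, p)
    for a, c in r1
    for a2, p in r2
    if a == a2 and (c, p) in r3
}

lossless = joined == acp
print(sorted(joined), lossless)
```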

Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.

Disadvantages of Normalization
o You cannot start building the database before knowing what the user needs.
o The performance degrades when normalizing the relations to higher normal forms,
i.e., 4NF, 5NF.
o It is very time-consuming and difficult to normalize relations of a higher degree.
o Careless decomposition may lead to a bad database design, leading to serious
problems.
DECOMPOSITION :
Decomposition is to break a relation into multiple relations to bring it into an
appropriate normal form. It helps to remove redundancy, inconsistencies,
and anomalies from a database. The decomposition of a relation R in a relational schema
is the process of replacing the original relation R with two or more relations in a
relational schema.

Issues of decomposition:
Decomposition is used to address the following problems of a poorly designed relation:
• Redundant Storage
The same information is stored in many places, which can confuse programmers and
takes up extra space in the system.
• Insertion Anomalies
It may not be possible to store some details unless some other, unrelated information is
stored as well.
• Deletion Anomalies
It may not be possible to delete some details without also losing some other information.

Lossless Decomposition
Decomposition is lossless if it is feasible to reconstruct relation R from the decomposed tables
using joins. This is the preferred choice: no information is lost from the relation when it is
decomposed, and the join results in the same original relation.
A lossless join decomposition ensures two things:
• No information is lost while decomposing the original relation.
• If we join back the decomposed sub relations, the same relation that was
decomposed is obtained.
We can follow certain rules to ensure that a decomposition is a lossless join
decomposition. Let’s say we have a relation R and we decomposed it into R1 and R2; then
the rules are:
1. The union of the attributes of both the sub relations R1 and R2 must contain all the
attributes of the original relation R.
R1 ∪ R2 = R
2. The intersection of the attributes of both the sub relations R1 and R2 must not be empty,
i.e., there should be some attributes that are present in both R1 and R2.
R1 ∩ R2 ≠ ∅
3. The intersection of the attributes of both the sub relations R1 and R2 must be a
superkey of R1 or R2, or of both.
R1 ∩ R2 = Super key of R1 or R2
Let us see an example −
<EmpInfo>
Emp_ID   Emp_Name   Emp_Age   Emp_Location   Dept_ID   Dept_Name
E001     Jacob      29        Alabama        Dpt1      Operations
E002     Henry      32        Alabama        Dpt2      HR
E003     Tom        22        Texas          Dpt3      Finance

Decompose the above table into two tables:

<EmpDetails>
Emp_ID   Emp_Name   Emp_Age   Emp_Location
E001     Jacob      29        Alabama
E002     Henry      32        Alabama
E003     Tom        22        Texas

<DeptDetails>
Dept_ID   Emp_ID   Dept_Name
Dpt1      E001     Operations
Dpt2      E002     HR
Dpt3      E003     Finance

Now, Natural Join is applied on the above two tables −

The result will be −
Emp_ID   Emp_Name   Emp_Age   Emp_Location   Dept_ID   Dept_Name
E001     Jacob      29        Alabama        Dpt1      Operations
E002     Henry      32        Alabama        Dpt2      HR
E003     Tom        22        Texas          Dpt3      Finance

Therefore, the above decomposition is lossless, i.e. no information is lost.
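
The EmpInfo example can be replayed in a few lines to confirm both the join result and rule 3. This is a sketch in plain Python with the relations held as sets of tuples; the variable names are chosen for the illustration.

```python
# Rows of EmpInfo: (Emp_ID, Emp_Name, Emp_Age, Emp_Location, Dept_ID, Dept_Name).
emp_info = {
    ("E001", "Jacob", 29, "Alabama", "Dpt1", "Operations"),
    ("E002", "Henry", 32, "Alabama", "Dpt2", "HR"),
    ("E003", "Tom", 22, "Texas", "Dpt3", "Finance"),
}

# Decompose by projection.
emp_details = {(e, n, a, loc) for e, n, a, loc, d, dn in emp_info}
dept_details = {(d, e, dn) for e, n, a, loc, d, dn in emp_info}

# Natural join on the common attribute Emp_ID.
rejoined = {
    (e, n, a, loc, d, dn)
    for e, n, a, loc in emp_details
    for d, e2, dn in dept_details
    if e == e2
}

# Lossless: the join gives back exactly the original rows.
lossless = rejoined == emp_info

# Rule 3: the common attribute Emp_ID identifies each EmpDetails row,
# i.e. it acts as a key of EmpDetails in this data.
emp_id_is_key = len({row[0] for row in emp_details}) == len(emp_details)

print(lossless, emp_id_is_key)
```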
Decomposition is the process of breaking an original relation into multiple sub relations.
Decomposition helps to remove anomalies, redundancy, and other problems in a DBMS.
Decomposition can be lossy or lossless.
An ideal decomposition should be both a lossless join decomposition and dependency
preserving.
