Unit1 And2 Dbms

This document provides an overview and contents of Unit 2 which presents details about the entity relationship (ER) model for database design. The ER model provides a high-level view of how to design a database using entities, attributes, and relationships. The unit will cover the ER model features and how to use it for conceptual database design by mapping attributes and entities with relationships.

Uploaded by

Hemanthi Dhanala

Course Overview

Database management has evolved from a specialized computer application into a central
component of the modern computing environment. As a result, knowledge of database systems has become
an essential part of an education in computer science. This course presents the
fundamental concepts of database systems: database design, database languages, and
database-system implementation.

UNIT – 1

OVERVIEW:

Unit – 1 provides a general overview of the nature and purpose of database systems. It also explains how
the concept of a database system has developed, what the common features of database systems are, what
a database system does for the user, and how a database system interfaces with operating systems. This
unit is motivational, historical, and explanatory in nature.

The material also gives details about the entity-relationship (E-R) model, which provides a high-level view
of database design issues. Designing a database requires a systematic approach based on a data model, so we
see how to use the E-R model to design a database.

CONTENTS:

1. Introduction to database systems
2. File systems vs. DBMS
3. Various data models
4. Levels of abstraction
5. Database languages
6. Structure of a DBMS

DEFINITION OF DBMS

A DBMS (database management system) is software used to manage a collection of interrelated data.

File systems Vs DBMS:

A typical file-processing system is supported by the operating system. Permanent records are stored in
various files, and application programs are written to create and manipulate those files. Before the
advent of DBMSs, organizations typically stored their information in such systems.

Ex: Using COBOL, we can maintain several files (collections of records). To access those files, we
have to go through the application programs that were written for creating files, updating files, and
inserting records.

The problems with a file-processing system are:

1. Data redundancy and inconsistency
2. Difficulty in accessing data
3. Data isolation
4. Integrity problems
5. Atomicity problems
6. Security problems

DBMSs were developed to solve these problems.

View of data:

The main purpose of a DBMS is to provide users with an abstract view of the data.

Data abstraction is provided at three levels:

1. Physical level: how the data are actually stored, that is, which data structures are used to store
the data on the hard disk.
Ex: Sequential, tree structured

2. Logical level: what data are stored in the database, and what relationships exist among those data.

3. View level: describes only the part of the database that is relevant to a particular user.
Ex: The required records in a table.

Instance:

The collection of information stored in the database at a particular moment is called an instance of the
database.

Database schema:

The overall design of the database is called the database schema. Schemas exist at different levels of
abstraction:

1. Logical schema
2. Physical schema

Data independence:

The ability to modify a schema definition at one level without affecting a schema definition at the next
higher level is called data independence. There are two kinds:

1. Physical data independence
2. Logical data independence

Data models:

Underlying the structure of a database is the data model: a collection of conceptual tools for describing
data, data relationships, and data semantics.

There are three types of data models:

1. Object-based logical models:

These are used to describe data at the logical and view levels. They are divided into several types:

- Entity-relationship model
- Object-oriented model
- Semantic data model
- Functional data model

2. Record-based logical models:

In contrast to object-based models, these are used both to specify the overall logical structure of the
database and to provide a higher-level description of the implementation.

- Relational model
- Network model
- Hierarchical model

3. Physical data models:

These are used to describe data at the lowest (physical) level.

Database languages:

A database system provides two types of languages:

1. Data definition language (DDL):

A database schema is specified by a set of definitions expressed in a special language called the DDL.

2. Data manipulation language (DML):

This enables users to access or manipulate data as organized by the appropriate data model.

Database administrator: The person who has central control over the DBMS is called the
database administrator (DBA).

The functions of the DBA are:

1. Schema definition
2. Definition of access methods
3. Modification of the schema and physical organization
4. Granting of authorization for data access
5. Specification of integrity constraints

Structure of a DBMS

Figure 1.3 shows the structure of a typical DBMS.


Unit 2 presents details about the entity-relationship (E-R) model. This model provides a high-level view of
database design issues. Designing a database requires a systematic approach based on a data model, so we see
how to use the E-R model to design a database.

Contents:

1. Overview of database design
2. ER model
3. Features of the ER model
4. Conceptual design using the ER model

Database Design:

The database design process can be divided into six steps.

The ER model is most relevant to the first three steps.

1. Requirements Analysis:

The very first step in designing a database application is to understand what data is to be stored in the
database, what applications must be built on top of it, and what operations are most frequent and
subject to performance requirements. In other words, we must find out what the users want from the
database.

2. Conceptual Database Design:

The information gathered in the requirements analysis step is used to develop a high-level
description of the data to be stored in the database, along with the constraints that are known to hold
over this data. This step is often carried out using the ER model, or a similar high-level data model.

3. Logical Database Design:

We must choose a DBMS to implement our database design, and convert the conceptual database
design into a database schema in the data model of the chosen DBMS.

4. Schema Refinement:

The fourth step in database design is to analyse the collection of relations in our relational
database schema to identify potential problems, and to refine it. In contrast to the requirements
analysis and conceptual design steps, which are essentially subjective, schema refinement can be
guided by some elegant and powerful theory.

5. Physical Database Design:

In this step we must consider typical expected workloads that our database must support and
further refine the database design to ensure that it meets desired performance criteria. This step
may simply involve building indexes on some tables and clustering some tables, or it may involve
a substantial redesign of parts of the database schema obtained from the earlier design steps.

6. Security Design:

In this step, we identify different user groups and different roles played by various users (Eg: the
development team for a product, the customer support representatives, the product manager).
For each role and user group, we must identify the parts of the database that they must be able to
access and the parts of the database that they should not be allowed to access, and take steps to
ensure that they can access only the necessary parts.

Overview of the ER (Entity-Relationship) model:

The entity-relationship (E-R) data model is based on a perception of the real world as consisting of a
set of basic objects called entities, and of relationships among these objects.

It was developed to facilitate database design by allowing the specification of an enterprise
schema, which represents the overall logical structure of a database.

The database structure under the E-R model is usually shown pictorially using entity-relationship
(E-R) diagrams. The entities and the relationships between them are shown in the figure, using the
following conventions:

1. An entity set is shown as a rectangle.
2. A diamond represents the relationship among a number of entities, which are connected to the
diamond by lines.
3. Attributes, shown as ovals, are connected to the entities or relationships by lines.
4. Diamonds, ovals, and rectangles are labeled. The type of relationship existing between the entities is
represented by giving the cardinality of the relationship on the line joining the relationship to the entity.

Mapping Cardinalities:

Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can
be associated via a relationship set.

Mapping cardinalities are most useful in describing binary relationship sets, although occasionally they
contribute to the description of relationship sets that involve more than two entity sets.

For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the
following:

One to one:

An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one
entity in A.

One to many:

An entity in A is associated with any number of entities in B. An entity in B, however, can be associated
with at most one entity in A.

Many to one:

An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with
any number of entities in A.

Many to many:

An entity in A is associated with any number of entities in B, and an entity in B is associated with any
number of entities in A.
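
The four cases above can be sketched in Python. This is only an illustration: the function name mapping_cardinality and the sample data are hypothetical, and the function reports the tightest case that a given set of (A, B) pairs happens to satisfy. A real mapping cardinality is a constraint chosen by the designer, not something read off one instance.

```python
def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (a, b) pairs."""
    partners_of_a = {}  # entity in A -> set of associated entities in B
    partners_of_b = {}  # entity in B -> set of associated entities in A
    for a, b in pairs:
        partners_of_a.setdefault(a, set()).add(b)
        partners_of_b.setdefault(b, set()).add(a)

    # Does some entity in A relate to more than one entity in B, and vice versa?
    a_to_many = any(len(bs) > 1 for bs in partners_of_a.values())
    b_to_many = any(len(as_) > 1 for as_ in partners_of_b.values())

    if not a_to_many and not b_to_many:
        return "one-to-one"
    if a_to_many and not b_to_many:
        return "one-to-many"
    if not a_to_many and b_to_many:
        return "many-to-one"
    return "many-to-many"


# Hypothetical data: department d1 employs e1 and e2, d2 employs e3.
dept_emp = [("d1", "e1"), ("d1", "e2"), ("d2", "e3")]
kind = mapping_cardinality(dept_emp)
```

Here each department is associated with several employees while each employee belongs to a single department, so the pairs fit the one-to-many case from departments to employees.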

The appropriate mapping cardinality for a particular relationship set depends on the real-world
situation that is being modeled by the relationship set. The overall logical structure of a database
can be expressed graphically by an E-R diagram, which is built up from the following components:

1. Rectangles, which represent entity sets
2. Ellipses, which represent attributes
3. Diamonds, which represent relationship sets
4. Lines, which link attributes to entity sets and entity sets to relationship sets
5. Double ellipses, which represent multivalued attributes
6. Double lines, which indicate total participation of an entity in a relationship set

Entities:

An entity is a thing that can be distinctly identified.

Entities are classified into regular entities and weak entities.

A weak entity is an entity that is existence-dependent on some other entity, in the sense that it cannot exist
if that other entity does not also exist.

For example, an employee's dependents might be weak entities: they cannot exist (so far as the
database is concerned) if the relevant employee does not exist. In particular, if a given employee is deleted,
all dependents of that employee must be deleted too.

A regular entity, by contrast, is an entity that is not weak.

Eg: Employees might be regular entities.

Note: Some use the term "strong entity" instead of "regular entity".

Properties ( Attributes )

Entities, and also relationships, have properties.

All entities or relationships of a given type have certain kinds of properties in common. For example, all
employees have an employee number, a name, a salary, and so on.

Each kind of property draws its values from a corresponding value set (i.e., a domain, in relational terms).

Furthermore, properties can be:

1. Simple or composite
Eg: The composite property employee name might be made up of the simple properties first
name, middle initial, and last name.

2. Key (i.e., unique, possibly within some context)
Eg: A dependent's name might be unique within the context of a given employee.

3. Single or multivalued (in other words, repeating groups are permitted)
Eg: All properties shown in the figure are single valued, but if a given supplier could have several
distinct supplier locations, then supplier city might be a multivalued property.

4. Missing
Ex: Unknown or not applicable

5. Base or derived
Ex: Total quantity for a given part might be derived as the sum of the individual shipment
quantities for that part.

Note: Some use the term "attribute" instead of "property" in an E/R context.

Relationships :

A relationship is "an association among entities".

For Ex: There is a relationship called DEPT-EMP between departments and employees, representing the
fact that certain departments employ certain employees.

As with entities, it is necessary in principle to distinguish between relationship types and relationship
instances.

The entities involved in a given relationship are said to be participants in that relationship. The number of
participants in a given relationship is called the degree of that relationship.

Let R be a relationship type that involves entity type E as a participant. If every instance of E participates
in at least one instance of R, then the participation of E in R is said to be total; otherwise it is said to be
partial.

For Ex: If every employee must belong to a department, then the participation of employees in DEPT-EMP
is total; if it is possible for a given department to have no employees at all, then the participation of
departments in DEPT-EMP is partial.


In Unit – 2 we discuss data storage and retrieval. It deals with disk, file, and file system structure,
and with the mapping of relational and object data to a file system. A variety of data access techniques are
presented in this unit, including hashing, B+-tree indices, and grid file indices. External sorting, which
is carried out in secondary memory, is also discussed here.

Conceptual Database Design With The ER Model

Developing an ER diagram presents several choices, including the following:

1. Should a concept be modeled as an entity or an attribute?
2. Should a concept be modeled as an entity or a relationship?
3. What are the relationship sets and their participating entity sets?
4. Should we use binary or ternary relationships?
5. Should we use aggregation?

Overview :

The Relational Model defines two root languages for accessing a relational database: Relational Algebra
and Relational Calculus. Relational Algebra is a low-level, operator-oriented language. Creating a query
in Relational Algebra involves combining relational operators using algebraic notation. Relational
Calculus is a high-level, declarative language. Creating a query in Relational Calculus involves describing
what results are desired.

SQL (Structured Query Language) is a database sublanguage for querying and modifying relational
databases. The basic structure in SQL is the statement, which is used to write queries and to modify
tables and columns.

Contents:

1. Introduction to the Relational Model
2. Integrity Constraints over Relations
3. Enforcing Integrity Constraints
4. Querying Relational Data
5. Introduction to Views, Destroying/Altering Tables and Views

Introduction to the Relational Model

The main construct for representing data in the relational model is a relation. A relation
consists of a relation schema and a relation instance. The relation instance is a table, and the
relation schema describes the column heads for the table. We first describe the relation schema
and then the relation instance. The schema specifies the relation’s name, the name of each field
(or column, or attribute), and the domain of each field. A domain is referred to in a relation
schema by the domain name and has a set of associated values.

Eg:

Students(sid: string, name: string, login: string, age: integer, gpa: real)

This says, for instance, that the field named sid has a domain named string. The set of values
associated with domain string is the set of all character strings.

An instance of a relation is a set of tuples, also called records, in which each tuple has the
same number of fields as the relation schema. A relation instance can be thought of as a table in
which each tuple is a row, and all rows have the same number of fields.
An instance of the Students relation appears in Figure 3.1.
A relation schema specifies the domain of each field or column in the relation instance. These
domain constraints in the schema specify an important condition that we want each instance of
the relation to satisfy: The values that appear in a column must be drawn from the domain
associated with that column. Thus, the domain of a field is essentially the type of that field, in
programming language terms, and restricts the values that can appear in the field.

Domain constraints are so fundamental in the relational model that we will henceforth consider
only relation instances that satisfy them; therefore, relation instance means relation instance that
satisfies the domain constraints in the relation schema.

The degree, also called arity, of a relation is the number of fields. The cardinality of a relation
instance is the number of tuples in it. In Figure 3.1, the degree of the relation (the number of
columns) is five, and the cardinality of this instance is six.

A relational database is a collection of relations with distinct relation names. The relational
database schema is the collection of schemas for the relations in the database.

Creating and Modifying Relations

The SQL-92 language standard uses the word table to denote relation, and we will often
follow this convention when discussing SQL. The subset of SQL that supports the creation,
deletion, and modification of tables is called the Data Definition Language (DDL).

The CREATE TABLE statement is used to define a new table. To create the Students relation, we can use
the following statement:

CREATE TABLE Students ( sid CHAR(20), name CHAR(30), login CHAR(20), age INTEGER, gpa REAL )

Tuples are inserted using the INSERT command. We can insert a single tuple into the Students table
as follows:

INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, 'Smith', 'smith@ee', 18, 3.2)

We can delete tuples using the DELETE command. We can delete all Students tuples with name equal to
Smith using the command:

DELETE FROM Students S WHERE S.name = 'Smith'

We can modify the column values in an existing row using the UPDATE command. For example, we can
increment the age and decrement the gpa of the student with sid 53688:

UPDATE Students S SET S.age = S.age + 1, S.gpa = S.gpa - 1 WHERE S.sid = 53688
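
As a rough, runnable sketch, the statements above can be tried against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made for illustration, not a system named in these notes. Two small adaptations are made for it: the range variable S is dropped from UPDATE and DELETE, because SQLite does not accept qualified column names in SET, and sid is written as a string literal to match its CHAR(20) declaration.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# DDL: define the Students table.
conn.execute(
    "CREATE TABLE Students (sid CHAR(20), name CHAR(30), "
    "login CHAR(20), age INTEGER, gpa REAL)"
)

# DML: insert one tuple.
conn.execute(
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)"
)

# Increment the age and decrement the gpa of student 53688.
conn.execute("UPDATE Students SET age = age + 1, gpa = gpa - 1 WHERE sid = '53688'")
after_update = conn.execute("SELECT age, gpa FROM Students").fetchall()

# Delete every tuple whose name is Smith.
conn.execute("DELETE FROM Students WHERE name = 'Smith'")
remaining = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
```

After the UPDATE the single row has age 19 and a gpa of about 2.2, and after the DELETE the table is empty.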

Integrity Constraints Over Relations

An integrity constraint (IC) is a condition that is specified on a database schema, and
restricts the data that can be stored in an instance of the database. If a database instance satisfies all
the integrity constraints specified on the database schema, it is a legal instance. A DBMS
enforces integrity constraints, in that it permits only legal instances to be stored in the database.

Key Constraints

Consider the Students relation and the constraint that no two students have the same student id.
This IC is an example of a key constraint. A key constraint is a statement that a certain minimal
subset of the fields of a relation is a unique identifier for a tuple. A set of fields that uniquely
identifies a tuple according to a key constraint is called a candidate key for the relation; we
often abbreviate this to just key. In the case of the Students relation, the (set of fields containing
just the) sid field is a candidate key.

There are two parts to the definition of (candidate) key:

1. Two distinct tuples in a legal instance (an instance that satisfies all ICs, including the key
constraint) cannot have identical values in all the fields of a key.

2. No subset of the set of fields in a key is a unique identifier for a tuple.

The first part of the definition means that in any legal instance, the values in the key fields
uniquely identify a tuple in the instance.

The second part of the definition means, for example, that the set of fields {sid, name} is not a
key for Students, because this set properly contains the key {sid}. The set {sid, name} is an
example of a superkey, which is a set of fields that contains a key.

Out of all the available candidate keys, a database designer can identify a primary key.
Intuitively, a tuple can be referred to from elsewhere in the database by storing the values of its
primary key fields. For example, we can refer to a Students tuple by storing its sid value.

Specifying Key Constraints in SQL

CREATE TABLE Students ( sid CHAR(20), name CHAR(30), login CHAR(20), age
INTEGER, gpa REAL, UNIQUE (name, age), CONSTRAINT StudentsKey PRIMARY KEY
(sid) )

This definition says that sid is the primary key and that the combination of name and age is also
a key. The definition of the primary key also illustrates how we can name a constraint by
preceding it with CONSTRAINT constraint-name. If the constraint is violated, the constraint
name is returned and can be used to identify the error.

Foreign Key Constraints

Sometimes the information stored in a relation is linked to the information stored in another relation. If
one of the relations is modified, the other must be checked, and perhaps modified, to keep the data
consistent. An IC involving both relations must be specified if a DBMS is to make such checks. The most
common IC involving two relations is a foreign key constraint.

Suppose that in addition to Students, we have a second relation:

Enrolled(sid: string, cid: string, grade: string)

To ensure that only bona fide students can enroll in courses, any value that appears in the sid field
of an instance of the Enrolled relation should also appear in the sid field of some tuple in the
Students relation. The sid field of Enrolled is called a foreign key and refers to Students. The foreign key in the
referencing relation (Enrolled, in our example) must match the primary key of the referenced relation (Students), i.e., it
must have the same number of columns and compatible data types, although the column names can be different.

Specifying Foreign Key Constraints in SQL

CREATE TABLE Enrolled ( sid CHAR(20), cid CHAR(20), grade CHAR(10), PRIMARY KEY
(sid, cid), FOREIGN KEY (sid) REFERENCES Students )

Enforcing Integrity Constraints

Consider the instance S1 of Students shown in Figure 3.1. The following insertion violates the primary
key constraint because there is already a tuple with the sid 53688, and it will be rejected by the DBMS:

INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, 'Mike', 'mike@ee', 17, 3.4)

The following insertion violates the constraint that the primary key cannot contain null:

INSERT INTO Students (sid, name, login, age, gpa) VALUES (null, 'Mike', 'mike@ee', 17, 3.4)
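
The rejections described above can be observed with a small sketch that uses Python's sqlite3 module. SQLite is an assumption chosen for illustration, and it needs two adjustments that are not part of the notes: foreign key checking must be switched on with PRAGMA foreign_keys, and sid is declared NOT NULL explicitly, because SQLite by default tolerates nulls in a non-integer primary key. The helper try_insert and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption specific to SQLite: foreign keys are not enforced unless enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE Students (sid CHAR(20) NOT NULL, name CHAR(30), login CHAR(20), "
    "age INTEGER, gpa REAL, UNIQUE (name, age), "
    "CONSTRAINT StudentsKey PRIMARY KEY (sid))"
)
conn.execute(
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(10), "
    "PRIMARY KEY (sid, cid), FOREIGN KEY (sid) REFERENCES Students)"
)
conn.execute("INSERT INTO Students VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)")


def try_insert(sql):
    """Run one INSERT; return None on success, or the error text if rejected."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Same sid as an existing tuple: violates the primary key constraint.
duplicate_key = try_insert(
    "INSERT INTO Students VALUES ('53688', 'Mike', 'mike@ee', 17, 3.4)")
# Null in the primary key field.
null_key = try_insert(
    "INSERT INTO Students VALUES (NULL, 'Mike', 'mike@ee', 17, 3.4)")
# sid 99999 does not appear in Students: violates the foreign key constraint.
unknown_student = try_insert(
    "INSERT INTO Enrolled VALUES ('99999', 'Reggae203', 'B')")
# sid 53688 exists in Students, so this enrollment is legal.
valid_enrollment = try_insert(
    "INSERT INTO Enrolled VALUES ('53688', 'Reggae203', 'B')")
```

The first three insertions are rejected with an integrity error, while the last one succeeds because it references an existing student.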

Querying Relational Data

A relational database query is a question about the data, and the answer consists of a new
relation containing the result. For example, we might want to find all students younger than 18 or
all students enrolled in Reggae203.
A query language is a specialized language for writing queries.

SQL is the most popular commercial query language for a relational DBMS. Consider the instance of the
Students relation shown in Figure 3.1. We can retrieve rows corresponding to students who are younger
than 18 with the following SQL query:

SELECT * FROM Students S WHERE S.age < 18

The symbol * means that we retain all fields of selected tuples in the result. The condition S.age
< 18 in the WHERE clause specifies that we want to select only tuples in which the age field has
a value less than 18. This query evaluates to the relation shown in Figure 3.6.

Introduction To Views

A view is a table whose rows are not explicitly stored in the database but are computed as needed from a
view definition. Consider the Students and Enrolled relations. Suppose that we are often interested in
finding the names and student identifiers of students who got a grade of B in some course, together with
the cid for the course. We can define a view for this purpose. Using SQL notation:

CREATE VIEW BStudents (name, sid, course) AS SELECT S.name, S.sid, E.cid FROM Students S,
Enrolled E WHERE S.sid = E.sid AND E.grade = 'B'

This view can be used just like a base table, or explicitly stored table, in defining new queries or
views. Given the instances of Enrolled and Students shown in Figure 3.4, BStudents contains
the tuples shown in Figure 3.18.
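
The following sketch shows that a view's rows are computed on demand and are not stored. It uses Python's sqlite3 module as an assumed stand-in DBMS, the tables are cut down to the columns the view needs, and the sample rows are hypothetical (they are not the instances from the figures).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(30));
    CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(10));
    INSERT INTO Students VALUES ('53666', 'Jones'), ('53688', 'Smith');
    INSERT INTO Enrolled VALUES ('53666', 'History105', 'B'),
                                ('53688', 'Reggae203', 'A');
    CREATE VIEW BStudents (name, sid, course) AS
        SELECT S.name, S.sid, E.cid
        FROM Students S, Enrolled E
        WHERE S.sid = E.sid AND E.grade = 'B';
""")

# Query the view as if it were a base table.
before = conn.execute("SELECT name, sid, course FROM BStudents").fetchall()

# Change a base table; the view reflects it without being redefined.
conn.execute("INSERT INTO Enrolled VALUES ('53688', 'Topology112', 'B')")
after = conn.execute(
    "SELECT name, sid, course FROM BStudents ORDER BY sid").fetchall()
```

Before the extra enrollment the view holds only the row for Jones. Afterwards it also holds a row for Smith, even though nothing was ever inserted into BStudents itself.
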
Destroying/Altering Tables and Views

To destroy a base table, use the DROP TABLE command. For example, DROP TABLE Students RESTRICT
destroys the Students table unless some view or integrity constraint refers to Students; if so, the command
fails. If the keyword RESTRICT is replaced by CASCADE, Students is dropped and any referencing
views or integrity constraints are (recursively) dropped as well; one of these two keywords must always
be specified. A view can be dropped using the DROP VIEW command, which is just like DROP TABLE.

ALTER TABLE modifies the structure of an existing table. To add a column called maiden_name to
Students, for example, we would use the following command:

ALTER TABLE Students ADD COLUMN maiden_name CHAR(10)

The definition of Students is modified to add this column, and all existing rows are padded with null
values in this column. ALTER TABLE can also be used to delete columns and to add or drop integrity
constraints on a table.

Relational Algebra

Relational algebra is one of the two formal query languages associated with the relational model.
Queries in algebra are composed using a collection of operators. A fundamental property is that every
operator in the algebra accepts (one or two) relation instances as arguments and returns a relation instance
as the result. This property makes it easy to compose operators to form a complex query —a relational
algebra expression is recursively defined to be a relation, a unary algebra operator applied to a single
expression, or a binary algebra operator applied to two expressions. We describe the basic operators of
the algebra (selection, projection, union, cross-product, and difference).
Selection and Projection

Relational algebra includes operators to select rows from a relation (σ) and to project columns (π).

These operations allow us to manipulate data in a single relation. Consider the instance of the Sailors
relation shown in Figure 4.2, denoted as S2. We can retrieve rows corresponding to expert sailors by
using the σ operator. The expression σ_{rating>8}(S2) evaluates to the relation shown in Figure 4.4. The subscript
rating>8 specifies the selection criterion to be applied while retrieving tuples.
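
Selection and projection can be imitated in a few lines of Python. In this sketch a relation instance is a list of dictionaries, the helper names select and project are hypothetical, and the three sailors are made-up rows rather than the instance S2 from the figure.

```python
def select(relation, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, fields):
    """Projection (pi): keep only the named fields and drop duplicate rows."""
    result = []
    for row in relation:
        reduced = {f: row[f] for f in fields}
        if reduced not in result:  # a relation is a set, so no duplicates
            result.append(reduced)
    return result


# Hypothetical instance of Sailors(sid, sname, rating, age).
sailors = [
    {"sid": 1, "sname": "Ann", "rating": 9, "age": 35.0},
    {"sid": 2, "sname": "Bob", "rating": 8, "age": 35.0},
    {"sid": 3, "sname": "Cho", "rating": 10, "age": 22.5},
]

experts = select(sailors, lambda s: s["rating"] > 8)  # sigma_{rating>8}
ages = project(sailors, ["age"])                      # pi_{age}
```

Selection keeps whole rows (here Ann and Cho), while projection keeps whole columns and, because the result is again a relation, collapses the two rows with age 35.0 into one.
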
Set Operations

The following standard operations on sets are also available in relational algebra: union (∪), intersection (∩), set-difference (-), and cross-product (×).

1. Union: R∪S returns a relation instance containing all tuples that occur in either relation instance R or
relation instance S (or both). R and S must be union-compatible, and the schema of the result is defined to
be identical to the schema of R.
2. Intersection: R∩S returns a relation instance containing all tuples that occur in both R and S. The
relations R and S must be union-compatible, and the schema of the result is defined to be identical to the
schema of R.
3. Set-difference: R-S returns a relation instance containing all tuples that occur in R but not in S. The
relations R and S must be union-compatible, and the schema of the result is defined to be identical to the
schema of R.
4. Cross-product: R×S returns a relation instance whose schema contains all the fields of R (in the same
order as they appear in R) followed by all the fields of S (in the same order as they appear in S). The
result of R×S contains one tuple <r, s> (the concatenation of tuples r and s) for each pair of tuples
r ∈ R, s ∈ S. The cross-product operation is sometimes called Cartesian product.
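
Because relation instances are sets of tuples, the four operations map directly onto Python's built-in set type. The instances R and S below are hypothetical and union-compatible (both have the fields sid and name).

```python
# Two hypothetical union-compatible instances with fields (sid, name).
R = {(22, "Ann"), (31, "Bob")}
S = {(31, "Bob"), (58, "Cho")}

union = R | S          # tuples in R or S (or both)
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

# Cross-product: concatenate every tuple of R with every tuple of S.
cross_product = {r + s for r in R for s in S}
```

The cross-product has one tuple per pair, so its size is the product of the sizes of R and S (four tuples here), and each result tuple has the fields of R followed by the fields of S.
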
Joins

The join operation is one of the most useful operations in relational algebra and is the most commonly
used way to combine information from two or more relations. Although a join can be defined as a cross-
product followed by selections and projections, joins arise much more frequently in practice than plain
cross-products.

Condition Joins

The most general version of the join operation accepts a join condition c and a pair of relation instances as
arguments, and returns a relation instance. The join condition is identical to a selection condition in form.
The operation is defined as follows: R ⋈_c S = σ_c(R × S). Thus a condition join is a cross-product followed by a selection.
As an example, the result of .
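
A condition join can be sketched in Python as a cross-product filtered by the join condition. The function name condition_join, the tuple layouts, and the data are hypothetical and are not the instances from the textbook figures.

```python
def condition_join(r_rows, s_rows, condition):
    """Cross-product of the two instances followed by a selection."""
    return [r + s for r in r_rows for s in s_rows if condition(r, s)]


sailors = [(22, "Ann"), (58, "Bob")]          # (sid, sname)
reserves = [(22, 101), (58, 103), (31, 102)]  # (sid, bid)

# Join condition: Sailors.sid = Reserves.sid
equal_sid = condition_join(sailors, reserves, lambda s, r: s[0] == r[0])

# Join condition: Sailors.sid < Reserves.sid
smaller_sid = condition_join(sailors, reserves, lambda s, r: s[0] < r[0])
```

The condition may use any comparison, not only equality. With the < condition, sailor 22 pairs with the reservations for sids 58 and 31, and sailor 58 pairs with none.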

Relational Calculus

Relational calculus is an alternative to relational algebra. In contrast to the algebra, which is
procedural, the calculus is nonprocedural, or declarative, in that it allows us to describe the set of answers
without being explicit about how they should be computed.

The variant of the calculus that we present in detail is called the tuple relational calculus (TRC).
Variables in TRC take on tuples as values. In another variant, called the domain relational calculus
(DRC), the variables range over field values.

Tuple Relational Calculus

A tuple variable is a variable that takes on tuples of a particular relation schema as values. That is, every
value assigned to a given tuple variable has the same number and type of fields. A tuple relational
calculus query has the form { T | p(T) }, where T is a tuple variable and p(T) denotes a formula that
describes T. The result of this query is the set of all tuples t for which the formula p(T) evaluates to true
with T = t. The language for writing formulas p(T) is thus at the heart of TRC and is essentially a simple
subset of first-order logic.

As a simple example, consider the following query.

Find all sailors with a rating above 7.


{S | S ∈ Sailors ∧ S.rating > 7}
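
A Python set comprehension has almost the same shape as this TRC query: it states which tuples belong to the answer and leaves the evaluation order to the interpreter. The Sailor type and the three rows are hypothetical sample data.

```python
from collections import namedtuple

# Hypothetical instance of Sailors(sid, sname, rating, age).
Sailor = namedtuple("Sailor", ["sid", "sname", "rating", "age"])
sailors = {
    Sailor(1, "Ann", 9, 35.0),
    Sailor(2, "Bob", 7, 40.0),
    Sailor(3, "Cho", 8, 22.5),
}

# {S | S ∈ Sailors ∧ S.rating > 7}
answer = {S for S in sailors if S.rating > 7}
```

The tuple variable S ranges over whole Sailor tuples, so each member of the answer carries all four fields.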

Syntax of TRC Queries

Let Rel be a relation name, R and S be tuple variables, a an attribute of R, and b an attribute of S. Let op
denote an operator in the set {<, >, =, ≤, ≥, ≠}. An atomic formula is one of the following:

- R ∈ Rel
- R.a op S.b
- R.a op constant, or constant op R.a

A formula is recursively defined to be one of the following, where p and q are themselves formulas, and
p(R) denotes a formula in which the variable R appears:

- any atomic formula
- ¬p, p ∧ q, p ∨ q, or p ⇒ q
- ∃R(p(R)), where R is a tuple variable
- ∀R(p(R)), where R is a tuple variable

Domain Relational Calculus

A domain variable is a variable that ranges over the values in the domain of some attribute (e.g., the
variable can be assigned an integer if it appears in an attribute whose domain is the set of integers). A
DRC query has the form {<x1, x2, ..., xn> | p(<x1, x2, ..., xn>)}, where each xi is either a domain variable or a constant and
p(<x1, x2, ..., xn>) denotes a DRC formula whose only free variables are the variables among the xi, 1 ≤ i ≤
n. The result of this query is the set of all tuples <x1, x2, ..., xn> for which the formula evaluates to true.

A DRC formula is defined in a manner that is very similar to the definition of a TRC formula. The main difference is that the variables are now
domain variables. Let op denote an operator in the set {<, >, =, ≤, ≥, ≠} and let X and Y be domain variables.

An atomic formula in DRC is one of the following:


<x1, x2, ..., xn> ∈ Rel, where Rel is a relation with n attributes; each xi, 1 ≤ i ≤ n, is either a variable or a constant
X op Y
X op constant, or constant op X
A formula is recursively defined to be one of the following, where p and q are themselves formulas,
and p(X) denotes a formula in which the variable X appears:

any atomic formula


¬p, p ∧ q, p ∨ q, or p ⇒ q
∃X(p(X)), where X is a domain variable
∀X(p(X)), where X is a domain variable

Eg:Find all sailors with a rating above 7.


{<I, N, T, A>|<I, N, T, A>∈Sailors ∧ T>7}
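
The DRC query above differs from its TRC counterpart only in what the variables stand for. A minimal sketch in Python, assuming a made-up Sailors instance stored as plain (sid, sname, rating, age) rows:

```python
# Hypothetical Sailors instance as (sid, sname, rating, age) rows.
sailors = {
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
}

# {<I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7}:
# each variable is bound to a single field value, not to a whole tuple.
result = {(i, n, t, a) for (i, n, t, a) in sailors if t > 7}
```

Here `i`, `n`, `t`, and `a` play the role of the domain variables I, N, T, and A.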

The Form of a Basic SQL Query:

SELECT [ DISTINCT ] select-list

FROM from-list

WHERE qualification

Every query must have a SELECT clause, which specifies columns to be retained in the result, and a FROM
clause, which specifies a cross-product of tables. The optional WHERE clause specifies selection
conditions on the tables mentioned in the FROM clause.

Eg: 1. Find the names and ages of all sailors.

SELECT DISTINCT S.sname, S.age FROM Sailors S

2.Find all sailors with a rating above 7.

SELECT S.sid, S.sname, S.rating, S.age FROM Sailors AS S WHERE S.rating > 7
We now consider the syntax of a basic SQL query in detail.

The from-list in the FROM clause is a list of table names. A table name can be followed by a range
variable; a range variable is particularly useful when the same table name appears more than once in the
from-list.
The select-list is a list of (expressions involving) column names of tables named in the from-list.
Column names can be prefixed by a range variable.
The qualification in the WHERE clause is a boolean combination (i.e., an expression using the logical
connectives AND, OR, and NOT) of conditions of the form expression op expression, where op is one of
the comparison operators {<, <=, =, <>, >=, >}. An expression is a column name, a constant, or an
(arithmetic or string) expression.
The DISTINCT keyword is optional. It indicates that the table computed as an answer to this query
should not contain duplicates, that is, two copies of the same row. The default is that duplicates are not
eliminated.

The conceptual evaluation strategy of an SQL query is as follows:

1. Compute the cross-product of the tables in the from-list.

2. Delete those rows in the cross-product that fail the qualification conditions.

3. Delete all columns that do not appear in the select-list.

4. If DISTINCT is specified, eliminate duplicate rows.
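
The four steps can be traced by hand on a tiny example. The sketch below is written in Python and assumes made-up Sailors and Reserves rows; it follows the conceptual strategy literally and is not how a real DBMS evaluates a query:

```python
from itertools import product

# Hypothetical instances; the rows are made up for illustration.
sailors = [(22, "Dustin", 7), (31, "Lubber", 8), (58, "Rusty", 10)]  # sid, sname, rating
reserves = [(22, 101), (58, 103), (58, 103)]                          # sid, bid

# Query being traced:
#   SELECT DISTINCT S.sname FROM Sailors S, Reserves R
#   WHERE S.sid = R.sid AND R.bid = 103

# Step 1: cross-product of the tables in the from-list.
# Each combined row is (S.sid, S.sname, S.rating, R.sid, R.bid).
cross = [s + r for s, r in product(sailors, reserves)]

# Step 2: delete rows that fail the qualification.
qualified = [row for row in cross if row[0] == row[3] and row[4] == 103]

# Step 3: delete columns not in the select-list (sname is column 1).
projected = [(row[1],) for row in qualified]

# Step 4: DISTINCT eliminates duplicate rows (first occurrence is kept).
distinct = list(dict.fromkeys(projected))
```

Without step 4 the answer would contain Rusty twice, because the hypothetical Reserves instance records two reservations of boat 103 by sailor 58.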

ADDITIONAL TOPIC:

UNION, INTERSECT, AND EXCEPT

SQL provides three set-manipulation constructs that extend the basic query form. Since the answer to a query is a multiset of rows, it is natural to consider the use
of operations such as union, intersection, and difference. SQL supports these operations under the names UNION, INTERSECT, and EXCEPT.

Union:

Eg: Find the names of sailors who have reserved a red or a green boat.

SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid =
B.bid AND B.color = 'red'

UNION

SELECT S2.sname FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid = R2.sid AND R2.bid =
B2.bid AND B2.color = 'green'
This query says that we want the union of the set of sailors who have reserved red boats and the set
of sailors who have reserved green boats.

Intersect:

Eg:Find the names of sailors who have reserved both a red and a green boat.

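SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid =
B.bid AND B.color = 'red'

INTERSECT

SELECT S2.sname FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid = R2.sid AND R2.bid =
B2.bid AND B2.color = 'green'

Note that this query intersects names, not sailors. If two different sailors share a name and one has reserved
a red boat while the other has reserved a green boat, the name is returned even though no single sailor has
reserved both. Selecting the key sid instead of sname avoids the problem.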

Except:

Eg:Find the sids of all sailors who have reserved red boats but not green boats.

SELECT S.sid FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND
B.color = 'red'

EXCEPT

SELECT S2.sid FROM Sailors S2, Reserves R2, Boats B2 WHERE S2.sid = R2.sid AND R2.bid
= B2.bid AND B2.color = 'green'

SQL also provides other set operations: IN (to check if an element is in a given set), op ANY, op ALL (to
compare a value with the elements in a given set, using comparison operator op), and EXISTS (to check if a
set is nonempty). IN and EXISTS can be prefixed by NOT, with the obvious modification to their meaning.
UNION, INTERSECT, and EXCEPT are covered in this section; the other operations are covered under nested queries.
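
The three set operations can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the table layouts, the sample rows, and the helper `run` are assumptions made for illustration, and the queries select sid so that sailors are compared by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Made-up sample rows.
    INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5),
                               (64, 'Horatio', 7, 35.0);
    INSERT INTO Boats VALUES (102, 'Interlake', 'red'), (103, 'Clipper', 'green');
    INSERT INTO Reserves VALUES (22, 102), (22, 103), (31, 102), (64, 103);
""")

# sids of sailors who have reserved a red boat; GREEN is the same query for green.
RED = """SELECT S.sid FROM Sailors S, Reserves R, Boats B
         WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'"""
GREEN = RED.replace("'red'", "'green'")

def run(set_op):
    """Combine the red and green queries with the given set operator."""
    rows = conn.execute(f"{RED} {set_op} {GREEN}").fetchall()
    return sorted(sid for (sid,) in rows)

red_or_green = run("UNION")        # reserved a red or a green boat
red_and_green = run("INTERSECT")   # reserved both a red and a green boat
red_not_green = run("EXCEPT")      # reserved a red boat but no green boat
```

With these rows, sailor 22 has reserved both colors, sailor 31 only red, and sailor 64 only green.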

NESTED QUERIES
A nested query is a query that has another query embedded within it; the embedded query is called a
subquery.

The IN operator checks whether an element is in a given set, and NOT IN checks whether an
element is not in a given set.

Eg:1. Find the names of sailors who have reserved boat 103.

SELECT S.sname FROM Sailors S

WHERE S.sid IN ( SELECT R.sid

FROM Reserves R

WHERE R.bid = 103 )

The nested subquery computes the (multi)set of sids for sailors who have reserved boat 103, and the top-
level query retrieves the names of sailors whose sid is in this set. The IN operator allows us to test
whether a value is in a given set of elements; an SQL query is used to generate the set to be tested.

2.Find the names of sailors who have not reserved a red boat.

SELECT S.sname FROM Sailors S

WHERE S.sid NOT IN ( SELECT R.sid

FROM Reserves R

WHERE R.bid IN ( SELECT B.bid

FROM Boats B

WHERE B.color = 'red' ) )
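
Both nested queries can be run with Python's built-in sqlite3 module. This is a minimal sketch; the simplified table layouts and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Made-up sample rows.
    INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
    INSERT INTO Boats VALUES (102, 'red'), (103, 'green');
    INSERT INTO Reserves VALUES (22, 102), (22, 103), (31, 103);
""")

# Names of sailors who have reserved boat 103.
reserved_103 = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
    ORDER BY S.sname
""").fetchall()

# Names of sailors who have not reserved a red boat.
no_red = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R
                        WHERE R.bid IN (SELECT B.bid FROM Boats B
                                        WHERE B.color = 'red'))
    ORDER BY S.sname
""").fetchall()
```

Rusty has made no reservations at all and therefore appears in `no_red`: NOT IN keeps every sailor whose sid is absent from the inner result.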


Correlated Nested Queries

In the nested queries that we have seen, the inner subquery has been completely independent of the outer
query. In general, the inner subquery could depend on the row that is currently being examined in the
outer query. Such a subquery is said to be correlated.

Eg: Find the names of sailors who have reserved boat number 103.

SELECT S.sname FROM Sailors S

WHERE EXISTS ( SELECT *

FROM Reserves R

WHERE R.bid = 103 AND R.sid = S.sid )

The EXISTS operator is another set comparison operator, like IN. It allows us to test whether a set
is nonempty. Because the subquery refers to S.sid, it is conceptually re-evaluated for each Sailors row.
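
A correlated subquery can be tried with Python's built-in sqlite3 module. This is a minimal sketch; the simplified tables and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Made-up sample rows.
    INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
    INSERT INTO Reserves VALUES (22, 101), (31, 103), (58, 103);
""")

# The subquery mentions S.sid from the outer query, so its result
# depends on the Sailors row currently being examined.
reserved_103 = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)
    ORDER BY S.sname
""").fetchall()

# NOT EXISTS flips the test: sailors with no reservation for boat 103.
not_reserved_103 = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE NOT EXISTS (SELECT * FROM Reserves R
                      WHERE R.bid = 103 AND R.sid = S.sid)
    ORDER BY S.sname
""").fetchall()
```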

Set-Comparison Operators

SQL also supports op ANY and op ALL, where op is one of the arithmetic comparison operators {<, <=,
=, <>, >=, >}.

Eg:1. Find sailors whose rating is better than some sailor called Horatio.

SELECT S.sid FROM Sailors S

WHERE S.rating > ANY ( SELECT S2.rating

FROM Sailors S2

WHERE S2.sname = 'Horatio' )


If there are several sailors called Horatio, this query finds all sailors whose rating is better than that
of some sailor called Horatio.

2.Find the sailors with the highest rating.

SELECT S.sid FROM Sailors S

WHERE S.rating >= ALL ( SELECT S2.rating

FROM Sailors S2 )
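
The meaning of op ANY and op ALL can be sketched with Python's built-in `any` and `all`. This is an illustration of the semantics only, assuming made-up (sid, sname, rating) rows with no null values; it is plain Python rather than SQL because SQLite, used for the other sketches, does not accept ANY and ALL.

```python
# Hypothetical (sid, sname, rating) rows, made up for illustration.
sailors = [
    (29, "Brutus", 1),
    (64, "Horatio", 7),
    (74, "Horatio", 9),
    (58, "Rusty", 10),
    (71, "Zorba", 10),
]

horatio_ratings = [r for (_, name, r) in sailors if name == "Horatio"]
all_ratings = [r for (_, _, r) in sailors]

# S.rating > ANY (...): the comparison must hold for at least one element.
better_than_some_horatio = [sid for (sid, _, r) in sailors
                            if any(r > h for h in horatio_ratings)]

# S.rating >= ALL (...): the comparison must hold for every element.
highest_rated = [sid for (sid, _, r) in sailors
                 if all(r >= other for other in all_ratings)]
```

If no sailor were called Horatio, `any` over the empty list would be false for every row, so the first query would return nothing; `all` over an empty list is true, which matches how op ALL treats an empty subquery result.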

Aggregate Operators

SQL supports five aggregate operations, which can be applied to any column, say A, of a relation:

COUNT ([DISTINCT] A): The number of (unique) values in the A column.

SUM ([DISTINCT] A): The sum of all (unique) values in the A column.

AVG ([DISTINCT] A): The average of all (unique) values in the A column.

MAX (A): The maximum value in the A column.

MIN (A): The minimum value in the A column.

Eg:1. Find the average age of all sailors.

SELECT AVG (S.age) FROM Sailors S

2.Count the number of sailors.

SELECT COUNT (*) FROM Sailors S


The GROUP BY and HAVING Clauses

Thus far, we have applied aggregate operations to all (qualifying) rows in a relation. Often we want to
apply aggregate operations to each of a number of groups of rows in a relation, where the number of
groups depends on the relation instance.

Syntax:

SELECT [ DISTINCT ] select-list

FROM from-list

WHERE qualification

GROUP BY grouping-list

HAVING group-qualification

Eg:Find the age of the youngest sailor for each rating level.

SELECT S.rating, MIN (S.age) FROM Sailors S GROUP BY S.rating
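
The grouping query can be run with Python's built-in sqlite3 module. This is a minimal sketch with made-up rows; the second query, which adds a HAVING clause, is an extra illustration and does not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    -- Made-up sample rows.
    INSERT INTO Sailors VALUES
        (22, 'Dustin', 7, 45.0), (64, 'Horatio', 7, 35.0),
        (31, 'Lubber', 8, 55.5), (58, 'Rusty', 10, 35.0);
""")

# One result row per rating level: the age of the youngest sailor in that group.
youngest = conn.execute("""
    SELECT S.rating, MIN(S.age) FROM Sailors S
    GROUP BY S.rating
    ORDER BY S.rating
""").fetchall()
# youngest == [(7, 35.0), (8, 55.5), (10, 35.0)]

# HAVING filters whole groups: keep only rating levels with at least two sailors.
crowded = conn.execute("""
    SELECT S.rating, MIN(S.age) FROM Sailors S
    GROUP BY S.rating
    HAVING COUNT(*) >= 2
""").fetchall()
# crowded == [(7, 35.0)]
```

WHERE removes individual rows before the groups are formed, while HAVING removes groups after they have been formed.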

NULL VALUES

SQL provides a special column value called null for cases where a column value is either unknown or
inapplicable.

Logical Connectives AND, OR, and NOT

The logical operators AND, OR, and NOT use a three-valued logic in which expressions evaluate to true,
false, or unknown. OR of two arguments evaluates to true if either argument evaluates to true, and to
unknown if one argument evaluates to unknown and the other evaluates to false or unknown. (If both arguments are
false, of course, it evaluates to false.) AND of two arguments evaluates to false if either argument
evaluates to false, and to unknown if one argument evaluates to unknown and the other evaluates to true
or unknown. NOT of unknown evaluates to unknown.
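
The three-valued truth tables can be written out as small Python functions. In this sketch, `None` stands in for unknown; the function names `and3`, `or3`, and `not3` are made up for illustration.

```python
# None plays the role of SQL's unknown.

def and3(a, b):
    """Three-valued AND: false dominates, then unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    """Three-valued OR: true dominates, then unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    """Three-valued NOT: unknown stays unknown."""
    return None if a is None else not a
```

For example, `or3(False, None)` is unknown while `or3(True, None)` is true, and `and3(False, None)` is false while `and3(True, None)` is unknown.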
Outer Joins

Join operations that rely on null values, called outer joins, are supported in SQL. Consider the join of
two tables, say Sailors and Reserves, under a join condition c. In an ordinary join, tuples of Sailors that
do not match some row in Reserves according to c do not appear in the result. In an outer join, on the other hand, Sailors rows without
a matching Reserves row appear exactly once in the result, with the result columns inherited from
Reserves assigned null values.

In fact, there are several variants of the outer join idea. In a left outer join, Sailors rows without a
matching Reserves row appear in the result, but not vice versa. In a right outer join, Reserves rows
without a matching Sailors row appear in the result, but not vice versa. In a full outer join, both Sailors
and Reserves rows without a match appear in the result.
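
The difference between an ordinary join and a left outer join can be seen with Python's built-in sqlite3 module. This is a minimal sketch; the simplified tables and sample rows are assumptions made for illustration. Only the left variant is shown because older SQLite versions do not accept RIGHT or FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Made-up sample rows; sailor 31 has no reservation.
    INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
    INSERT INTO Reserves VALUES (22, 101), (58, 103);
""")

# Ordinary join: sailor 31 has no matching Reserves row and is dropped.
inner = conn.execute("""
    SELECT S.sid, R.bid FROM Sailors S JOIN Reserves R ON S.sid = R.sid
    ORDER BY S.sid
""").fetchall()

# Left outer join: sailor 31 appears once, with bid set to null (None in Python).
left_outer = conn.execute("""
    SELECT S.sid, R.bid FROM Sailors S LEFT OUTER JOIN Reserves R ON S.sid = R.sid
    ORDER BY S.sid
""").fetchall()
```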

Triggers and Active Databases

A trigger is a procedure that is automatically invoked by the DBMS in response to specified changes to
the database, and is typically specified by the DBA. A database that has a set of associated triggers is
called an active database.

A trigger description contains three parts:

Event: A change to the database that activates the trigger.


Condition: A query or test that is run when the trigger is activated.
Action: A procedure that is executed when the trigger is activated and its condition is true.

Eg: The trigger called init_count initializes a counter variable before every execution of an INSERT
statement that adds tuples to the Students relation. The trigger called incr_count increments the counter
for each inserted tuple that satisfies the condition age < 18. The syntax shown is illustrative; trigger syntax
varies from one DBMS to another.

CREATE TRIGGER init_count BEFORE INSERT ON Students /* Event */
DECLARE
count INTEGER;
BEGIN /* Action */
count := 0;
END

CREATE TRIGGER incr_count AFTER INSERT ON Students /* Event */
WHEN (new.age < 18) /* Condition */
FOR EACH ROW
BEGIN /* Action */
count := count + 1;
END
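
The event, condition, and action parts of a row-level trigger can be tried with Python's built-in sqlite3 module. This is a sketch in SQLite's trigger dialect, which differs from the syntax above: SQLite has no trigger-local variables, so a hypothetical one-row table `MinorCount` stands in for the counter. The Students columns and sample rows are also assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (sid INTEGER PRIMARY KEY, sname TEXT, age INTEGER);

    -- Hypothetical one-row table standing in for the 'count' variable.
    CREATE TABLE MinorCount (n INTEGER);
    INSERT INTO MinorCount VALUES (0);

    CREATE TRIGGER incr_count AFTER INSERT ON Students   -- Event
    FOR EACH ROW
    WHEN new.age < 18                                     -- Condition
    BEGIN                                                 -- Action
        UPDATE MinorCount SET n = n + 1;
    END;
""")

# Made-up rows: two of the three students are under 18.
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [(1, "Ann", 17), (2, "Bob", 19), (3, "Cy", 16)])

minors = conn.execute("SELECT n FROM MinorCount").fetchone()[0]
```

The trigger fires once per inserted row, and the action runs only for the rows that satisfy the condition.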
