DBMS CS208 M2Ktunotes - in
RELATIONAL MODEL
The relational model is today the primary data model for commercial data processing
applications. This model is simple and it has all the properties and capabilities required to process
data with storage efficiency.
Salient features of the relational model: –
Conceptually simple.
Fundamentals are intuitive and easy to pick up.
Powerful underlying theory: the relational model rests on formal mathematics (set theory and
predicate logic), which pays off when developing database algorithms and techniques.
Easy-to-use database language: though not formally part of the relational model, part of its
success is due to SQL, the de facto language for working with relational databases.
Structure
• A relational database is a collection of tables.
– Each table has a unique name.
– Each table consists of multiple rows.
– Each row is a set of values that by definition are related to each other in some way; these
values conform to the attributes or columns of the table.
– Each attribute of a table defines a set of permitted values for that attribute; this set of permitted
values is the domain of that attribute.
For example, consider the instructor table of Figure 2.1, which stores information about
instructors. The table has four column headers: ID, name, dept_name, and salary. Each row of this
table records information about an instructor, consisting of the instructor's ID, name, dept_name,
and salary. Similarly, the course table of Figure 2.2 stores information about courses, consisting
of a course_id, title, dept_name, and credits, for each course. Note that each instructor is identified
by the value of the column ID, while each course is identified by the value of the column
course_id.
Module II
In general, a row in a table represents a relationship among a set of values. Since a table
is a collection of such relationships, there is a close correspondence between the concept of table
and the mathematical concept of relation, from which the relational data model takes its name. In
mathematical terminology, a tuple is simply a sequence (or list) of values. A relationship
between n values is represented mathematically by an n-tuple of values, i.e., a tuple with n values,
which corresponds to a row in a table.
Thus, in the relational model the term relation is used to refer to a table, while the term
tuple is used to refer to a row. Similarly, the term attribute refers to a column of a table.
We use the term relation instance to refer to a specific instance of a relation, i.e., containing a
specific set of rows. The instance of instructor shown in Figure 2.1 has 12 tuples, corresponding
to 12 instructors.
For each attribute of a relation, there is a set of permitted values, called the domain of
that attribute. Thus, the domain of the salary attribute of the instructor relation is the set of all
possible salary values, while the domain of the name attribute is the set of all possible instructor
names. We require that, for all relations r, the domains of all attributes of r be atomic.
A domain is atomic if elements of the domain are considered to be indivisible units. For example,
suppose the table instructor had an attribute phone number, which can store a set of phone
numbers corresponding to the instructor. Then the domain of phone number would not be atomic,
since an element of the domain is a set of phone numbers, and it has subparts, namely the
individual phone numbers in the set.
The important issue is not what the domain itself is, but rather how we use domain
elements in our database. Suppose now that the phone number attribute stores a single phone
number. Even then, if we split the value from the phone number attribute into a country code, an
area code and a local number, we would be treating it as a nonatomic value. If we treat each
phone number as a single indivisible unit, then the attribute phone number would have an atomic
domain.
• This definition of a database table originates from the pure mathematical concept of a relation,
from which the term “relational data model” originates.
– Formally, for a table r with n attributes a1 . . . an, each attribute ak has a domain Dk, and any
given row of r is an n-tuple (v1, . . ., vn) such that vk ∈ Dk.
– Thus, any instance of table r is a subset of the Cartesian product D1 × · · · × Dn.
– We require that a domain Dk be atomic — that is, we do not consider the elements of Dk to
be breakable into subcomponents.
– A possible member of any domain is null, that is, an unknown or non-existent value; in
practice, we try to avoid nulls in our databases because they can cause a
number of practical issues.
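The formal definition above can be made concrete with a small sketch. The two domains and the sample rows below are hypothetical and deliberately tiny; the point is only that a relation instance is a set of n-tuples drawn from the Cartesian product of the attribute domains.

```python
from itertools import product

# Hypothetical, deliberately tiny domains for a two-attribute relation.
D_name = {"Katz", "Kim"}
D_dept = {"Comp. Sci.", "Elec. Eng."}

# The Cartesian product D_name x D_dept: every possible (name, dept) pair.
all_pairs = set(product(D_name, D_dept))

# One relation instance: a set of 2-tuples drawn from that product.
instructor = {("Katz", "Comp. Sci."), ("Kim", "Elec. Eng.")}


def is_valid_instance(rows, domains):
    """Check that every row is an n-tuple whose k-th value lies in domain k."""
    return all(
        len(row) == len(domains) and all(v in d for v, d in zip(row, domains))
        for row in rows
    )


print(len(all_pairs))                                   # size of the product
print(instructor <= all_pairs)                          # subset test
print(is_valid_instance(instructor, [D_name, D_dept]))  # domain check
```

A row such as ("Katz", "Physics") would fail the check, because "Physics" is not in the assumed domain of the second attribute.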
An ER diagram can be mapped to a relational schema; that is, a relational schema can be created
from an ER diagram. Not all ER constraints can be carried over into the relational model, but an
approximate schema can be generated.
There are several processes and algorithms available to convert ER diagrams into
relational schemas. Some of them are automated and some are manual. Here we focus
on mapping the contents of the diagram to relational basics.
ER diagrams mainly comprise −
Entity and its attributes
Relationship, which is association among entities.
Mapping Entity
An entity is a real-world object with some attributes.
Mapping Relationship
A relationship is an association among entities.
Mapping Process
Create table for a relationship.
Add the primary keys of all participating Entities as fields of table with their respective
data types.
If relationship has any attribute, add each attribute as field of table.
Declare a primary key composing all the primary keys of participating entities.
Declare all foreign key constraints.
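The steps for mapping a relationship can be sketched with Python's built-in sqlite3 module. The student, course and enrolls tables and their columns are hypothetical names chosen for illustration; they do not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  TEXT PRIMARY KEY, title TEXT);

-- Table for the relationship: keys of both participating entities,
-- the relationship's own attribute (grade), a composite primary key,
-- and one foreign key constraint per participating entity.
CREATE TABLE enrolls (
    student_id INTEGER NOT NULL,
    course_id  TEXT NOT NULL,
    grade      TEXT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES student (student_id),
    FOREIGN KEY (course_id)  REFERENCES course (course_id)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO course VALUES ('CS208', 'DBMS')")
conn.execute("INSERT INTO enrolls VALUES (1, 'CS208', 'A')")

# The composite primary key allows each (student, course) pair only once.
try:
    conn.execute("INSERT INTO enrolls VALUES (1, 'CS208', 'B')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrolled = conn.execute("SELECT COUNT(*) FROM enrolls").fetchone()[0]
print(duplicate_rejected, enrolled)
```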
Mapping Weak Entity Sets
A weak entity set is one which does not have any primary key associated with it.
Mapping Process
Create table for weak entity set.
Add all its attributes to table as field.
Add the primary key of identifying entity set.
Declare all foreign key constraints.
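A weak entity set can be sketched the same way. The employee and dependent tables below are hypothetical, and the choice to form the primary key from the owner's key plus a distinguishing attribute (dep_name) is a common convention assumed here, not something stated in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);

-- Weak entity set: its own attributes plus the key of the identifying entity set.
CREATE TABLE dependent (
    emp_id       INTEGER NOT NULL,
    dep_name     TEXT NOT NULL,
    relationship TEXT,
    PRIMARY KEY (emp_id, dep_name),
    FOREIGN KEY (emp_id) REFERENCES employee (emp_id)
);

INSERT INTO employee VALUES (1, 'Ravi');
INSERT INTO dependent VALUES (1, 'Meera', 'daughter');
""")

# A dependent cannot exist without its identifying employee.
try:
    conn.execute("INSERT INTO dependent VALUES (2, 'Arun', 'son')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```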
Mapping Process
Need of Constraints:
Constraints in the database provide a way to guarantee that the values of individual columns
are valid.
Domain Integrity
Domain integrity means the definition of a valid set of values for an attribute. You define:
- the data type,
- the length or size,
- whether null values are allowed,
- whether the value must be unique for the attribute.
You may also define the default value, the range (values in between) and/or specific values for
the attribute. Some DBMS allow you to define the output format and/or input mask for the
attribute.
These definitions ensure that a specific attribute will have a right and proper value in the
database.
Domain constraints specify the set of values an attribute can take. The value of each
attribute X must be an atomic value from the domain of X. The data types associated with
domains include integer, character, string, date, time, currency, etc. An attribute value must be
available in the corresponding domain. For example, the employee ID must be unique, and the
employee birthday must be in the range [Jan 1, 1950, Jan 1, 2000]. Such information is provided in
logical statements called integrity constraints.
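As a rough illustration of the birthday range and the unique employee ID, the sketch below uses sqlite3 with a hypothetical employee table. Dates are stored as ISO-8601 text so that string comparison matches date order; that storage choice is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,               -- unique employee ID
    name     TEXT NOT NULL,                     -- null not allowed
    birthday TEXT NOT NULL
             CHECK (birthday BETWEEN '1950-01-01' AND '2000-01-01')
)""")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((1, "Ravi", "1985-06-15"))      # inside the range
out_of_range = try_insert((2, "Mini", "2005-03-01"))  # violates the CHECK
duplicate_id = try_insert((1, "Anu", "1990-01-01"))   # violates uniqueness
print(accepted, out_of_range, duplicate_id)
```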
Key Constraints
Keys are attributes or sets of attributes that uniquely identify an entity within its entity
set. An Entity set E can have multiple keys out of which one key will be designated as
the primary key.
Every table requires a primary key. Neither the primary key nor any part of it can
contain NULL values, because a NULL in the primary key would mean that some rows cannot be
identified. For example, in the EMPLOYEE table, Phone cannot be a key since some
people may not have a phone.
The entity integrity constraint states that primary keys can't be null. There must be a proper
value in the primary key field.
On the other hand, there can be null values other than primary key fields. Null value means that
one doesn't know the value for that field. Null value is different from zero value or space.
In the Car Rental database in the Car table each car must have a proper and unique Reg_No.
There might be a car whose rate is unknown - maybe the car is broken or it is brand new - i.e.
the Rate field has a null value. See the picture below.
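The Car example can be tried out with sqlite3. The column names other than Reg_No and Rate, and the sample values, are assumptions. NOT NULL is written explicitly on the key because SQLite, unlike most systems, does not imply it for a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE car (reg_no TEXT PRIMARY KEY NOT NULL, model TEXT, rate REAL)"
)

# A car whose rate is not yet known: null is allowed outside the primary key.
conn.execute("INSERT INTO car VALUES ('KL-07-1234', 'Swift', NULL)")

# A car without a registration number violates entity integrity.
try:
    conn.execute("INSERT INTO car VALUES (NULL, 'Alto', 40.0)")
    null_key_rejected = False
except sqlite3.IntegrityError:
    null_key_rejected = True

# Null is not zero: comparing a null rate with 0 yields null (None), not true.
row = conn.execute("SELECT rate IS NULL, rate = 0 FROM car").fetchone()
print(null_key_rejected, row)
```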
Referential integrity
A foreign key must have a matching primary key or it must be null. This constraint is specified
between two tables (parent and child); it maintains the correspondence between rows in these
tables. It means the reference from a row in one table to another table must be valid. Examples
of referential integrity constraints:
This rule states that if a foreign key in Table 1 refers to the Primary Key of Table 2, then every
value of the Foreign Key in Table 1 must be null or be available in Table 2. For example,
Let the table in which the foreign key is defined be the foreign table or detail table (Table 1 in
the above example), and let the table that defines the primary key and is referenced by the foreign
key be the master table or primary table (Table 2 in the above example). Then the following
properties must hold:
Records cannot be inserted into the foreign table if corresponding records in the master table do
not exist.
Records of the master table cannot be deleted or updated if
corresponding records in the detail table exist.
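Both properties can be observed with sqlite3. The department (master) and employee (detail) tables are hypothetical stand-ins for Table 2 and Table 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dname TEXT);   -- master table
CREATE TABLE employee (                                              -- detail table
    emp_id  INTEGER PRIMARY KEY,
    dept_id INTEGER REFERENCES department (dept_id)
);
INSERT INTO department VALUES (10, 'Sales');
INSERT INTO employee VALUES (1, 10);    -- matches an existing department
INSERT INTO employee VALUES (2, NULL);  -- a null foreign key is allowed
""")


def rejected(sql):
    """Run a statement and report whether an integrity constraint rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


insert_without_master = rejected("INSERT INTO employee VALUES (3, 99)")
delete_referenced_master = rejected("DELETE FROM department WHERE dept_id = 10")
print(insert_without_master, delete_referenced_master)
```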
DATABASE LANGUAGES
A query language is a language in which a user requests information from the database.
These languages are usually on a level higher than that of a standard programming language.
There are a number of “pure” query languages: The relational algebra is procedural, whereas the
tuple relational calculus and domain relational calculus are nonprocedural.
Relational Operations
All procedural relational query languages provide a set of operations that can be applied to
either a single relation or a pair of relations. These operations have the nice and desired property
that their result is always a single relation. This property allows one to combine several of these
operations in a modular way. Specifically, since the result of a relational query is itself a
relation, relational operations can be applied to the results of queries as well as to the given set
of relations.
The operations are expressed in SQL.
The most frequent operation is the selection of specific tuples from a single relation (say
instructor) that satisfy some particular predicate (say salary > $85,000). The result is a new
relation that is a subset of the original relation. For the predicate “salary is greater
than $85,000”, we get the result shown in Figure 2.10.
Another frequent operation is to select certain attributes (columns) from a relation. The
result is a new relation having only those selected attributes. For example, suppose we want
a list of instructor IDs and salaries without listing the name and dept_name values from the
instructor relation of Figure 2.1; then the result, shown in Figure 2.11, has the two attributes
ID and salary. Each tuple in the result is derived from a tuple of the instructor relation but with only
selected attributes shown.
The join operation allows the combining of two relations by merging pairs of tuples, one
from each relation, into a single tuple. There are a number of different ways to join relations.
Figure 2.12 shows an example of joining the tuples from the instructor and department
tables with the new tuples showing the information about each instructor and the department
in which she is working. This result was formed by combining each tuple in the instructor
relation with the tuple in the department relation for the instructor’s department. In the form
of join shown in Figure 2.12, which is called a natural join, a tuple from the instructor
relation matches a tuple in the department relation if the values of their dept name attributes
are the same. All such matching pairs of tuples are present in the join result. In general, the
natural join operation on two relations matches tuples whose values are the same on all
attribute names that are common to both relations.
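The matching rule of the natural join can be sketched in a few lines of Python, with each relation held as a list of dictionaries. The sample rows are invented for illustration and are not the contents of Figures 2.1 or 2.12.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts (attribute -> value)."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attribute names shared by both relations
    result = []
    for tr in r:
        for ts in s:
            # Keep the pair only if it agrees on every common attribute.
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})
    return result


# Hypothetical sample data.
instructor = [
    {"ID": "1", "name": "Rao", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "2", "name": "Nair", "dept_name": "Finance", "salary": 90000},
]
department = [
    {"dept_name": "Comp. Sci.", "building": "Block A", "budget": 100000},
    {"dept_name": "Finance", "building": "Block B", "budget": 120000},
    {"dept_name": "Music", "building": "Block C", "budget": 80000},
]

joined = natural_join(instructor, department)
for row in joined:
    print(row["name"], row["dept_name"], row["building"])
```

The Music department has no instructor in this sample, so it does not appear in the result.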
The Cartesian product operation combines tuples from two relations, but unlike the join
operation, its result contains all pairs of tuples from the two relations, regardless of whether
their attribute values match. It is sometimes called the CROSS PRODUCT or CROSS JOIN: it
combines each tuple of one relation with all the tuples of the other relation.
Because relations are sets, we can perform normal set operations on relations.
The union operation performs a set union of two “similarly structured” tables (say a table of all
graduate students and a table of all undergraduate students).
For example, one can obtain the set of all students in a department. Other set operations, such as
intersection and set difference can be performed as well.
As we noted earlier, we can perform operations on the results of queries. For example, if we
want to find the ID and salary for those instructors who have salary greater than $85,000, we
would perform the first two operations in our example above. First we select those tuples from
the instructor relation where the salary value is greater than $85,000 and then, from that result,
select the two attributes ID and salary, resulting in the relation shown in Figure 2.13 consisting
of the ID and salary.
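Because every operation returns a relation, the two steps compose directly. The following is a minimal sketch with invented instructor rows; select and project are helper names introduced here for illustration.

```python
def select(relation, predicate):
    """Keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Keep only the named attributes, dropping duplicate result tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Hypothetical sample data.
instructor = [
    {"ID": "1", "name": "Rao", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "2", "name": "Nair", "dept_name": "Finance", "salary": 90000},
    {"ID": "3", "name": "Das", "dept_name": "Physics", "salary": 95000},
]

# First select, then project the result of the selection.
high_paid = select(instructor, lambda t: t["salary"] > 85000)
result = project(high_paid, ["ID", "salary"])
print(result)
```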
Downloaded from Ktunotes.in
2.1 Relational Databases
The relational model for database management is a data model based on set theory. It was introduced by
Edgar F. Codd of IBM Research in 1970. The fundamental assumption of the relational model is that all
data are represented as mathematical n-ary relations, an n-ary relation being a subset of the Cartesian
product of n sets. A relational database is a collection of tables, each of which is assigned a unique
name. A row in a table represents a relationship among a set of values.
A super key is defined in the relational model as a set of one or more attributes that, taken collectively,
uniquely identify a tuple in the relation. For example, the social_security_no attribute of the relation
employee is sufficient to distinguish one employee from another. Thus social_security_no is a superkey
for the relation employee.
Candidate key: A superkey for which no proper subset is also a superkey (a minimal superkey) is known as a
candidate key. For example, the attributes employee_id and organization_name can be combined to form a
superkey. But social_security_no alone is sufficient to distinguish two employees. Thus social_security_no
is a candidate key.
Primary key is used to denote the candidate key that is chosen by the database designer to uniquely identify a
tuple in a relation.
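The difference between a superkey and a candidate key can be checked mechanically on a sample instance. The rows below are invented. Note that a key is a property of the schema, so a single instance can only show that an attribute set is not a key; passing the check here is merely consistent with it being one.

```python
from itertools import combinations


def is_superkey(relation, attrs):
    """True if no two tuples of this instance agree on all of attrs."""
    seen = set()
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(relation, attributes):
    """Minimal superkeys of this instance, found by trying smaller sets first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in found):
                continue  # contains a smaller key, so attrs is not minimal
            if is_superkey(relation, attrs):
                found.append(attrs)
    return found


# Hypothetical sample data: employee_id repeats across organizations.
employee = [
    {"social_security_no": "111", "employee_id": 1, "organization_name": "Acme"},
    {"social_security_no": "222", "employee_id": 1, "organization_name": "Globex"},
    {"social_security_no": "333", "employee_id": 2, "organization_name": "Acme"},
]
attributes = ["social_security_no", "employee_id", "organization_name"]

print(is_superkey(employee, attributes))     # the whole tuple is a superkey
print(candidate_keys(employee, attributes))  # only the minimal ones
```

The designer would then pick one of the reported candidate keys as the primary key.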
A foreign key is an attribute or set of attributes of a relation R such that, in each tuple, the values of
the attributes in this set match the primary key value of some tuple of a relation S.
The operations of relational model can be categorized into retrievals and updates. There are three basic
update operations on relations: (a) Insert (b) Delete and (c) Update (or Modify). The update operations
and the effect of these operations on the various constraints specified on the relational database schema
are explained below:
Insert: The Insert operation is used to insert a new tuple or tuples in a relation. The Insert operation
provides a list of attribute values for a new tuple that is to be inserted into a relation. Insert can violate any
of the following four types of constraints (domain, key, entity integrity and referential integrity constraints):
Domain constraints can be violated if an attribute value doesn’t exist in the corresponding
domain.
Unique key constraints can be violated if the key value of the new tuple already exists in another
tuple in the relation.
Entity integrity can be violated if the primary key of the new tuple is null.
Referential integrity can be violated if the value of any foreign key in the new tuple refers to a
non-existing tuple in the referenced relation.
If an Insert operation violates one or more constraints, the DBMS rejects the insertion and informs
the user about the reason for rejection.
Delete: The Delete operation is used to delete tuples and the user has to provide a condition on the
attributes of the relation to select the tuple(s) to be deleted. The Delete operation can violate only referential
integrity, if the tuple being deleted is referenced by the foreign keys from other tuples in the database. If a
Delete operation violates any referential integrity constraints, then the DBMS rejects the deletion and
informs the user about the reason for rejection. Then the user will have following two options:
Attempt to cascade the deletion by deleting tuples that reference the tuple being deleted.
Modify the referencing attribute values that cause the violation; each value is set to null or
changed to reference another valid tuple. The tuple can then be deleted without violating the
referential integrity.
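In SQL these two options are usually declared on the foreign key itself rather than carried out by hand. The sketch below uses sqlite3 with hypothetical department, employee and dept_location tables: one foreign key sets the referencing value to null, the other cascades the deletion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT);

-- Option 2: the referencing value is set to null when the department goes away.
CREATE TABLE employee (
    ssn TEXT PRIMARY KEY,
    dno INTEGER REFERENCES department (dno) ON DELETE SET NULL
);

-- Option 1: referencing tuples are deleted together with the department.
CREATE TABLE dept_location (
    dno  INTEGER NOT NULL REFERENCES department (dno) ON DELETE CASCADE,
    city TEXT NOT NULL,
    PRIMARY KEY (dno, city)
);

INSERT INTO department VALUES (5, 'Research');
INSERT INTO employee VALUES ('909880123', 5);
INSERT INTO dept_location VALUES (5, 'Kochi');

DELETE FROM department WHERE dno = 5;
""")

employee_dno = conn.execute("SELECT dno FROM employee").fetchone()[0]
locations_left = conn.execute("SELECT COUNT(*) FROM dept_location").fetchone()[0]
print(employee_dno, locations_left)
```

After the delete, the employee row survives with a null department, while the location row is gone.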
Update: The Update operation is used to change the values of one or more attributes in a tuple (or tuples)
of some relation. The user has to specify a condition on the attributes of the relation to select the tuple (or
tuples) to be modified. Updating a non-key attribute (neither primary key nor foreign key) normally causes no
constraint violation, except for a domain constraint violation if the new value of the attribute doesn’t exist
in the corresponding domain. Modifying a primary key value is similar to deleting one tuple and inserting
another in its place. Hence, the constraint violations discussed under both Insert and Delete could happen
when a primary key value is modified. If a foreign key attribute is modified, a referential integrity
violation could happen if the new value is not null and refers to a non-existing tuple in the referenced relation.
UPDATE the SALARY of the EMPLOYEE tuple with SSN = ‘909880123’ to 35000.
The relational algebra operations are used to specify the retrieval operations in the relational model. Hence,
knowledge of relational algebra and its operations is required to understand retrieval operations in the
relational model. Relational algebra and its operations are discussed in subsection 2.2 below.
The relational algebra is a procedural query language. It consists of a collection of operations to manipulate
relations. It is similar to ordinary algebra (as in 2+3*x-y), except we use relations as values instead of
numbers, and the operations and operators are different. Relations in relational algebra are seen as sets of
tuples, so we can use basic set operations. The fundamental operations in relational algebra are select,
project, union, set difference, Cartesian product and rename. There are several other operations, namely
set intersection, natural join, division and assignment.
However, relational algebra is not used as a query language in actual DBMSs (SQL is used instead). The
inner, lower level operations of a relational DBMS are, or are similar to, relational algebra operations.
Knowledge about relational algebra is required to understand query execution and optimization in a
relational DBMS. Also, some advanced SQL queries require explicit relational algebra operations, most
commonly outer join. The differences between relational algebra and SQL are:
SQL is a declarative query language, which means that you tell the DBMS what you want, but not
how to get it. Relational algebra is procedural (like the procedural programming languages C++ or
Java), which means that you have to state, step by step, how the result should be
calculated. Strictly speaking, a relational algebra query is a mathematical expression, so it is
better described as more procedural than SQL rather than as a program.
Relations are seen as sets of tuples, which means that no duplicates are allowed. SQL behaves
differently in some cases (refer to the SQL keyword DISTINCT).
The data model also includes a set of operations to manipulate the data in addition to defining the database
structure and constraints. The retrieval operations in relational model are based on relational algebra
operations which are divided into two groups:
Set Operations – UNION, INTERSECTION, SET DIFFERENCE, and CARTESIAN PRODUCT.
Operations developed specifically for relational databases – SELECT, PROJECT, JOIN, etc.
The relational algebra operations have their own symbols. A summary of relational algebra operations and
their symbols is given in the table below. The set of relational algebra operations {σ, ∏, ∪, −, ×} is a
complete set; that is, any of the other relational algebra operations can be expressed as a sequence of
operations from this set.
The SELECT operation is denoted by σ<selection condition>(R),
where the symbol σ (sigma) is used to denote the SELECT operator, and the selection condition is a
Boolean expression specified on the attributes of relation R. R could be just the name of a database
relation or a relational algebra expression whose result is a relation. The result of the SELECT
operation will be another relation having the same attributes (same degree) as R. The number of tuples
in the resulting relation is always less than or equal to the number of tuples in R. That is, |σc(R)| ≤ |R|.
The Boolean expression specified in
<selection condition> can be a number of clauses of the following forms:
<attribute name> <comparison op> <constant value> OR
<attribute name> <comparison op> <attribute name>
where <attribute name> is the name of an attribute of R, <comparison op> is normally one of the
operators =, ≠, <, ≤, > and ≥, and <constant value> is a constant value from the attribute domain.
Clauses can be connected by the logical operators OR (∨), AND (∧) and NOT (¬) to form a
general selection condition.
The fraction of tuples selected by a selection condition is referred to as the selectivity of the condition.
The SELECT operation is commutative; that is a cascade of SELECT operations on a relation gives
the same result irrespective of the order of different SELECT operations. Also, a cascade of SELECT
operations can be combined into a single SELECT operation with conditions of all individual
SELECT operations connected using AND logical operators.
Example: Select all employees who either work in MKTG department and make over $25000 per year,
or work in PROJ department and make over $30000.
σ(DEPT=’MKTG’ AND SALARY>25000) OR (DEPT=’PROJ’ AND SALARY>30000)(EMPLOYEE)
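The example condition, the selectivity, and the commutativity of cascaded SELECTs can all be checked on a small instance. The EMPLOYEE rows below are invented, and select is a helper name introduced for this sketch.

```python
def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


# Hypothetical sample data.
employee = [
    {"NAME": "Anu", "DEPT": "MKTG", "SALARY": 26000},
    {"NAME": "Biju", "DEPT": "MKTG", "SALARY": 24000},
    {"NAME": "Chitra", "DEPT": "PROJ", "SALARY": 31000},
    {"NAME": "Dev", "DEPT": "PROJ", "SALARY": 28000},
]


def condition(t):
    # (DEPT='MKTG' AND SALARY>25000) OR (DEPT='PROJ' AND SALARY>30000)
    return (t["DEPT"] == "MKTG" and t["SALARY"] > 25000) or (
        t["DEPT"] == "PROJ" and t["SALARY"] > 30000
    )


result = select(employee, condition)
selectivity = len(result) / len(employee)  # fraction of tuples selected


def in_proj(t):
    return t["DEPT"] == "PROJ"


def over_30k(t):
    return t["SALARY"] > 30000


# A cascade of SELECTs in either order equals one SELECT with the conditions ANDed.
first_order = select(select(employee, in_proj), over_30k)
second_order = select(select(employee, over_30k), in_proj)
combined = select(employee, lambda t: in_proj(t) and over_30k(t))

print([t["NAME"] for t in result], selectivity)
print(first_order == second_order == combined)
```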
The PROJECT operation is denoted by ∏<attribute list>(R).
The result of the PROJECT operation will be another relation having only the attributes specified in
<attribute list> and in the same order as they appear in the list. Hence, its degree is equal to the
number of attributes in <attribute list>. If the attribute list includes only non-key attributes of R,
duplicate tuples are likely to occur, and the PROJECT operation removes any duplicate tuples. If the
projection list includes a super key of R, the resulting relation will have the same number of tuples as
R. So, the number of tuples in the resulting relation of a PROJECT operation is always less than or
equal to the number of tuples in R. That is, |∏a(R)| ≤ |R|. The PROJECT operation is not commutative.
Example: List each employee’s first name, last name and salary.
∏FNAME, LNAME, SALARY(EMPLOYEE)
R = P ∪ Q has tuples drawn from P and Q such that R = {t | t ∈ P or t ∈ Q} and
max(|P|,|Q|) ≤ |R| ≤ |P|+|Q|.
Example: List the name & address of all students and faculty members. It can be written as:
∏FNAME,ADDRESS(STUDENT) ∪ ∏FNAME,ADDRESS(FACULTY).
The result of the INTERSECTION operation is:
R = P ∩ Q such that R = {t | t ∈ P and t ∈ Q} and 0 ≤ |R| ≤ min(|P|,|Q|)
Example: List the names and addresses of all customers who have both a loan and an account. It can
be written as: ∏FNAME,ADDRESS(LOAN) ∩ ∏FNAME,ADDRESS(ACCOUNT).
Example: List the names and addresses of all customers (account holders) who do not have a loan. It
can be written as: ∏FNAME,ADDRESS(ACCOUNT) − ∏FNAME,ADDRESS(LOAN).
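Since relations are sets of tuples, Python's built-in set type mirrors these three operations directly. The two relations below stand for the already projected (FNAME, ADDRESS) pairs of LOAN and ACCOUNT, with invented values.

```python
# Hypothetical projected relations: sets of (FNAME, ADDRESS) tuples.
loan = {("Anil", "Kochi"), ("Bina", "Kollam")}
account = {("Bina", "Kollam"), ("Chitra", "Thrissur")}

either = loan | account      # union: customers with a loan, an account, or both
both = loan & account        # intersection: customers with a loan and an account
no_loan = account - loan     # set difference: account holders without a loan

# The cardinality bounds quoted in the text hold for any pair of sets.
assert max(len(loan), len(account)) <= len(either) <= len(loan) + len(account)
assert 0 <= len(both) <= min(len(loan), len(account))

print(sorted(either))
print(sorted(both))
print(sorted(no_loan))
```

These operations are only meaningful when both operands have the same attributes in the same order, which is why each side is projected onto FNAME, ADDRESS first.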
g) RENAME (ρ) Operation
The RENAME operation is used to rename either the relation name, or the attribute names, or both.
The general RENAME operation when applied to a relation R of degree n is denoted by:
ρS(B1, B2, …, Bn)(R) or ρS(R) or ρ(B1, B2, …, Bn)(R)
where the symbol ρ (rho) is used to denote the RENAME operator, S is the new relation name, and
B1, B2, …, Bn are the new attribute names. The first expression renames both the relation and its
attributes; the second renames the relation only; and the third renames the attributes only. If the
attributes of R are (A1, A2, …, An) in that order, then each Ai is renamed as Bi.
h) JOIN Operation
The JOIN operation, denoted by ⋈, is used to combine related tuples from two relations into single
tuples. In general, the result of the JOIN operation of two relations P(A1, A2, …, An) and Q(B1, B2, …, Bm)
is a relation R with n+m attributes R(A1, A2, …, An, B1, B2, …, Bm), in that order. The join operation
retrieves all tuples from the Cartesian product of the two relations with a matching condition. The
main difference between Cartesian product and the JOIN operation is that the JOIN retrieves only
combinations of tuples from two relations satisfying the join condition whereas Cartesian product
retrieves all combinations of tuples. The join condition is specified on attributes from the two relations
and is evaluated for each combination of tuples. The resulting relation R will have between zero tuples
(no tuples matching the join condition) and |P|*|Q| tuples (all tuples match the join condition), i.e.
0 ≤ |R| ≤ |P|*|Q|. The general syntax of the JOIN operation is:
P ⋈<join condition> Q
Example: Get the name of the manager of each department in the organization. List the department
name and first & last names of department manager. This can be written as:
DEPT_MGR <- DEPARTMENT ⋈MGRSSN=SSN EMPLOYEE
RESULT <- ∏DNAME,FNAME,LNAME(DEPT_MGR)
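The DEPT_MGR example can be traced in Python by filtering the Cartesian product with the join condition. The rows are invented, and the sketch assumes the two relations have no attribute names in common, so merging the dictionaries loses nothing.

```python
def join(p, q, condition):
    """Join: pairs from the Cartesian product of p and q that satisfy condition."""
    return [{**tp, **tq} for tp in p for tq in q if condition(tp, tq)]


# Hypothetical sample data.
department = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333445555"},
    {"DNAME": "Admin", "DNUMBER": 4, "MGRSSN": "987654321"},
]
employee = [
    {"FNAME": "Latha", "LNAME": "Menon", "SSN": "333445555"},
    {"FNAME": "Suresh", "LNAME": "Pillai", "SSN": "987654321"},
    {"FNAME": "Maya", "LNAME": "Thomas", "SSN": "123456789"},
]

# DEPT_MGR <- DEPARTMENT join(MGRSSN=SSN) EMPLOYEE
dept_mgr = join(department, employee, lambda d, e: d["MGRSSN"] == e["SSN"])

# RESULT <- projection of DEPT_MGR onto DNAME, FNAME, LNAME
result = [(t["DNAME"], t["FNAME"], t["LNAME"]) for t in dept_mgr]
print(result)
```

Because the condition uses only equality, this particular join is an equijoin; its result repeats the manager's number in both MGRSSN and SSN.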
EQUIJOIN: Join operations that involve join conditions with equality comparisons (=) only are
called EQUIJOINs. This is the most common JOIN operation. The result of EQUIJOINs will have
one or more pairs of attributes that have identical values in every tuple.
NATURAL JOIN (*): As explained above, the resulting relation of EQUIJOINs will always
have one or more pairs of attributes with identical values. A new operation called NATURAL
JOIN, denoted by *, is used to eliminate the duplicate superfluous attributes in an EQUIJOIN
condition. The standard definition of NATURAL JOIN requires that the two join attributes (or
each pair of join attributes) have the same name in both relations. If this is not the case, rename
operation has to be applied first to convert the name of the join attribute of one relation same as
that of the join attribute in second relation.
SEMI JOIN: The semijoin operation joins two relations and keeps only the attributes of the first
relation. This is used to reduce network traffic in a distributed environment: only the joining
attribute of one relation is sent to the other site where the second
relation exists. A semijoin operation P ⋉A=B Q, where A and B are domain-compatible attributes of P
and Q respectively, produces the same result as the relational algebra expression ∏P(P ⋈A=B Q),
where ∏P denotes projection onto the attributes of P. Note that the semijoin operation is not commutative.
OUTER JOIN: In the normal JOIN operation, only the tuples that match the join condition
are selected. Tuples without a matching (or related) tuple are eliminated from the JOIN result.
Tuples with null in the JOIN attributes are also eliminated. A set of operations called OUTER
JOINs provides the ability to include all the tuples, including unmatched tuples, of either or both
relations in the result of the join. The outer join combines an unmatched tuple in one of the
relations with an artificial tuple for the other relation, with all attributes of the artificial tuple set to NULL.
LEFT OUTER JOIN (⟕): The LEFT OUTER JOIN is written as P ⟕ Q where P and Q are
relations. The result of the left outer join is the set of all combinations of tuples in P and
Q that have matching values on their common attribute, in addition to tuples in P that have no
matching tuples in Q.
Example: Get a list of all departments & managers including all employees under each
department.
FULL OUTER JOIN (⟗): The FULL OUTER JOIN in effect combines the results of the
RT503 – DBMS – Module 2- Relational Databases
left and right outer joins. The full outer join is written as P ⟗ Q where P and Q are
relations. The result of the full outer join is the set of all combinations of tuples in P and Q that
have matching values on their common attribute, in addition to tuples in P that have no matching
tuples in Q and tuples in Q that have no matching tuples in P.
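The padding of unmatched tuples can be sketched as follows, using None to stand for NULL. The relations and attribute names are hypothetical, and the helpers assume a single common attribute with no nulls in it (Python treats None == None as true, unlike SQL).

```python
def left_outer_join(p, q, attr, q_attrs):
    """Left outer join of p and q on the common attribute attr."""
    result = []
    for tp in p:
        matches = [tq for tq in q if tq[attr] == tp[attr]]
        if matches:
            result.extend({**tp, **tq} for tq in matches)
        else:
            # Unmatched tuple of p: pad the attributes of q with None (NULL).
            result.append({**{a: None for a in q_attrs}, **tp})
    return result


def full_outer_join(p, q, attr, p_attrs, q_attrs):
    """Left outer join plus the tuples of q that match nothing in p."""
    result = left_outer_join(p, q, attr, q_attrs)
    p_values = {tp[attr] for tp in p}
    for tq in q:
        if tq[attr] not in p_values:
            result.append({**{a: None for a in p_attrs}, **tq})
    return result


# Hypothetical sample data.
department = [{"DNO": 1, "DNAME": "Research"}, {"DNO": 2, "DNAME": "Admin"}]
employee = [{"DNO": 1, "ENAME": "Anu"}, {"DNO": 3, "ENAME": "Biju"}]

left = left_outer_join(department, employee, "DNO", ["DNO", "ENAME"])
full = full_outer_join(
    department, employee, "DNO", ["DNO", "DNAME"], ["DNO", "ENAME"]
)
for row in full:
    print(row)
```

The Admin department appears with a null employee in both results; the employee in department 3 appears only in the full outer join, with a null department name.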
Example: Retrieve the names of employees who work on all the projects that Mr. George K
John works on.
Let r(R) and s(S) be relations, and let S ⊆ R. The relation r ÷ s is a relation on schema R –
S. A tuple t is in r ÷ s if for every tuple ts in s there is a tuple tr in r satisfying both of the following:
tr[S] = ts[S]
tr[R – S] = t[R – S]
These conditions say that the R – S portion of a tuple t is in r ÷ s if and only if there
are tuples with the R – S portion and the S portion in r for every value of the S portion in
relation s.
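A direct, unoptimized reading of this definition is sketched below. The works_on relation, its attribute names (ESSN, PNO) and the set of projects are invented to mirror the "works on all the projects" example; divide is a helper name introduced here.

```python
def divide(r, s, r_attrs, s_attrs):
    """r ÷ s for relations given as sets of tuples plus ordered attribute lists."""
    rest_idx = [i for i, a in enumerate(r_attrs) if a not in s_attrs]  # schema R - S
    s_idx = [r_attrs.index(a) for a in s_attrs]

    # Split every tuple of r into its (R - S) portion and its S portion.
    pairs = {
        (tuple(t[i] for i in rest_idx), tuple(t[i] for i in s_idx)) for t in r
    }
    candidates = {rest for rest, _ in pairs}

    # Keep a candidate only if it is paired with every tuple of s.
    return {c for c in candidates if all((c, ts) in pairs for ts in s)}


# Hypothetical sample data: (ESSN, PNO) pairs and the projects of one employee.
works_on = {
    ("E1", "P1"), ("E1", "P2"),
    ("E2", "P1"),
    ("E3", "P1"), ("E3", "P2"), ("E3", "P3"),
}
target_projects = {("P1",), ("P2",)}

result = divide(works_on, target_projects, ["ESSN", "PNO"], ["PNO"])
print(sorted(result))
```

E2 is excluded because it lacks P2, while E3 qualifies even though it also works on a project outside the divisor.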
j) Assignment Operation (←)
It works like assignment in a programming language; in these notes the arrow is typed as <-.
Example: Assign the result of a PROJECT operation to a relation RESULT. RESULT <-
∏DNAME,FNAME,LNAME(DEPT_MGR)
Relational calculus simply means ‘calculating with relations’. Relational calculus is a query system wherein
queries are expressed as variables and formulas on these variables. In relational calculus, a query is
expressed as a formula consisting of a number of variables and an
expression involving these variables. The formula describes the properties of the result relation to be obtained.
If the variables represent tuples from the specified relations, then the query or formula is referred to as
‘tuple relational calculus’. If the variables represent values drawn from specified domains, then the query
or formula is referred to as ‘domain relational calculus’. The relational calculus is based on predicate
calculus (a predicate stands for a property), which uses the following primitive symbols:
∧ - AND
∨ - OR
¬ - Negation
→ - Implication
↔ - Equivalence
∀ - Universal quantifier (For All)
∃ - Existential quantifier (For Some)
In English, we may read this equation as “the set of all tuples t such that there exists a tuple s in the
relation borrow for which the values of t and s for the cname attribute are equal, and the value of s for the
amount attribute is greater than 1200.”
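From the English reading, the formula is presumably of the form {t | ∃ s ∈ borrow (t[cname] = s[cname] ∧ s[amount] > 1200)}. A Python set comprehension has the same declarative shape: it states which values belong to the result rather than how to loop. The borrow rows and the bname attribute below are invented for illustration.

```python
# Hypothetical sample data for the borrow relation.
borrow = [
    {"cname": "Anil", "bname": "SFU", "amount": 1500},
    {"cname": "Bina", "bname": "SFU", "amount": 900},
    {"cname": "Chitra", "bname": "Downtown", "amount": 2000},
    {"cname": "Anil", "bname": "Downtown", "amount": 1300},
]

# "The set of cname values for which some tuple s in borrow has amount > 1200."
# Building a set mirrors the existential quantifier: one witness s is enough,
# and a customer with several qualifying loans still appears only once.
result = {s["cname"] for s in borrow if s["amount"] > 1200}
print(sorted(result))
```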
2. Find all customers having a loan from the SFU branch, and the cities in which they live:
2. Find all customers who have a loan for an amount greater than $1200.
3. Find all customers having a loan from the SFU branch, and the city in which they live.
4. Find all customers having a loan, an account or both at the SFU branch.
5. Find all customers who have an account at all branches located in Brooklyn.
3/8/2017 18
SQL (Structured Query Language) is a database sublanguage for querying and modifying relational
databases. It was developed by IBM Research in the mid-1970s and standardized by ANSI in 1986. In the
relational model, data is stored in structures called relations or tables. Each table has one or more
attributes or columns that describe the table. In relational databases, the table is the fundamental building
block of a database application.
SQL language consists of two categories of statements:
Data Definition Language (DDL) used to create, alter and drop schema objects such as tables and
indexes, and
Data Manipulation Language (DML) used to manipulate the data within those schema objects.
Oracle's SQL*Plus is a command line tool that allows a user to type SQL statements to be executed directly against
an Oracle database. SQL*Plus commands allow a user to do the following activities:
Enter, edit, store, retrieve, and run SQL statements
List the column definitions for any table
Format, perform calculations on, store, and print query results in the form of reports
Access and copy data between SQL databases
Oracle Database Objects
TABLESPACE - A tablespace is a logical storage unit, backed by one or more physical data files
on disk, in which tables are stored.
TABLE - Consists of ROWS of data separated into different COLUMNS.
VIEW – A virtual relation for specific user group.
INDEX – A search index created for a table to improve query performance.
TRIGGER – A sub program stored in the database that gets executed automatically on a data
manipulation event on a table.
SEQUENCE – A database object for creating sequential numbers for use as a primary key, for
example.
PROCEDURE / FUNCTION – A sub program stored in the database for a specific application
use.
PACKAGE – A collection of stored sub programs (procedures or functions) in the database.
USER – Returns the user name of the current session.
SYSDATE – Returns the current system date and time.
1) CREATE TABLE
The CREATE TABLE Statement creates a new database table.
Syntax:
CREATE TABLE <table name> (column1 data type (size), column2 data type (size),
column3 data type (size),… … … );
Constraints
Constraint specifications add additional restrictions on the contents of the table. They are automatically
enforced by the DBMS.
a) Primary Key: Specifies the columns that uniquely identify a row in a table. One or more
columns could be part of a primary key.
Syntax:
CREATE TABLE <table name> (column1 data type (size) PRIMARY KEY,
column2 data type (size), column3 data type (size),… … … );
e) CHECK: specifies a user defined constraint, known as a check condition. The CHECK
specifier is followed by a condition enclosed in parentheses.
Syntax:
CREATE TABLE <table name> (column1 data type (size) CHECK (condition), column2
data type (size), column3 data type (size),… … … );
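As an illustration of how the DBMS enforces PRIMARY KEY and CHECK constraints automatically, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The employee table, its columns, and the sample rows are assumptions made for this example only.

```python
import sqlite3

# In-memory SQLite database; stands in for an Oracle schema in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        first_name  VARCHAR(30),
        salary      NUMERIC(8, 2) CHECK (salary > 0)
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1090, 'Asha', 4500)")

# The first row repeats an existing primary key, the second breaks the CHECK.
rejected = []
for row in [(1090, "Ben", 3000), (1091, "Cara", -10)]:
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[0])

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, row_count)
```

Both offending rows are refused by the database itself, so no application code has to re-check the rules.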
2) ALTER TABLE
The ALTER TABLE statement is used to change the table definition.
To add a new column
ALTER TABLE <table name> ADD column name datatype (size);
To add a primary key
ALTER TABLE <table name> ADD PRIMARY KEY (column name);
To add a foreign key
ALTER TABLE <table name> ADD CONSTRAINT <constraint name>
FOREIGN KEY (column name) REFERENCES <table name> (column name);
To remove a foreign key constraint
ALTER TABLE <table name> DROP CONSTRAINT <constraint name>;
To rename a table
ALTER TABLE <table name> RENAME TO <new table name>;
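The add-column and rename forms can be tried out with a short sketch in Python's sqlite3 module. SQLite is used here only as a convenient stand-in: it does not accept the ADD CONSTRAINT and DROP CONSTRAINT forms, so those are left out. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, first_name VARCHAR(30))"
)

# Add a new column, then rename the table.
conn.execute("ALTER TABLE employee ADD skill VARCHAR(20)")
conn.execute("ALTER TABLE employee RENAME TO staff")

# PRAGMA table_info is SQLite-specific; field 1 of each row is the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(staff)")]
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
print(columns, tables)
```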
3) CREATE INDEX
The CREATE INDEX statement allows the creation of an index for an already existing relation or
table.
Syntax:
CREATE [UNIQUE] INDEX <name of index> ON <table name> (column name, ...);
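A minimal sketch of a unique index, again using Python's sqlite3 module as a stand-in for Oracle. The index name idx_employee_email and the email column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, email VARCHAR(50))")

# Index an already existing table; UNIQUE also forbids duplicate values.
conn.execute("CREATE UNIQUE INDEX idx_employee_email ON employee (email)")
conn.execute("INSERT INTO employee VALUES (1, 'a@example.com')")

try:
    conn.execute("INSERT INTO employee VALUES (2, 'a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

index_names = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")]
print(duplicate_rejected, index_names)
```

Without the UNIQUE keyword the index would only speed up lookups on the column and would accept the duplicate.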
4) DROP TABLE
The DROP TABLE statement removes a previously created table and its description from the catalog.
Syntax:
DROP TABLE <table name>
5) DROP INDEX
The DROP INDEX statement removes a previously created index from the database.
Syntax:
DROP INDEX <index name>
6) CREATE USER
The CREATE USER statement creates a new database user. Syntax:
CREATE USER <user name> IDENTIFIED BY <password>
7) ALTER USER
The ALTER USER statement is used to change the password of a database user. Syntax:
ALTER USER <user name> IDENTIFIED BY <new password>
8) CREATE SEQUENCE
The CREATE SEQUENCE statement is used to create a sequence to be used in DML statements.
Syntax:
CREATE SEQUENCE <sequence name>
START WITH <start sequence number>
MAXVALUE <maximum value of sequence number>
INCREMENT BY <increment value>
NOCYCLE;
The DML statements are used to manipulate (store and maintain) the data in database tables.
Common DML statements are:
o INSERT
o UPDATE
o DELETE
o SELECT
Operators in SQL
Arithmetic operators
+, -, *, /
Relational (comparison) operators
=, != or <>, >, >=, <, <=
Logical operators
AND, OR, NOT
Other Operators
a) LIKE - The LIKE operator implements a pattern match comparison, that is, it matches a string
value against a pattern string containing wild-card characters.
The wild-card characters for LIKE are percent '%' and underscore '_'. Underscore matches any
single character. Percent matches zero or more characters.
Example: SELECT * FROM <table name> WHERE <column name> LIKE 'abc%'
b) NOT LIKE - It tests whether a string value does not match a pattern string containing wild-card
characters.
Example: SELECT * FROM <table name> WHERE <column name> NOT LIKE 'abc%'
c) BETWEEN - The BETWEEN operator implements a range comparison, that is, it tests
whether a value lies between two other values (both end values included).
Example: SELECT * FROM <table name> WHERE qty BETWEEN 50 AND 500
d) NOT BETWEEN - It tests whether a value is not between two other values.
Example: SELECT * FROM <table name> WHERE qty NOT BETWEEN 50 AND 500
e) IN - The IN operator implements comparison to a list of values, that is, it tests whether a value
matches any value in a list of values.
Example: SELECT * FROM <table name> WHERE city IN ('Rome', 'Paris')
f) NOT IN - It tests whether a value does not match any value in a list of values.
Example: SELECT * FROM <table name> WHERE city NOT IN ('Rome', 'Paris')
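The operators above can be compared side by side with a small sketch in Python's sqlite3 module. The item table and its four rows are invented for illustration, and names() is a hypothetical helper that only shortens the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name VARCHAR(20), city VARCHAR(20), qty INTEGER)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)",
    [("abc", "Rome", 700), ("abcd", "Rome", 40), ("abz", "Paris", 100), ("xabc", "Oslo", 500)],
)

def names(where_clause):
    """Return the item names matching the given WHERE clause, sorted by name."""
    sql = "SELECT name FROM item WHERE " + where_clause + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

like_prefix = names("name LIKE 'abc%'")            # % matches zero or more characters
like_single = names("name LIKE 'ab_'")             # _ matches exactly one character
not_like = names("name NOT LIKE 'abc%'")
between = names("qty BETWEEN 50 AND 500")          # both end values are included
not_between = names("qty NOT BETWEEN 50 AND 500")
in_list = names("city IN ('Rome', 'Paris')")
not_in_list = names("city NOT IN ('Rome', 'Paris')")
```

Each negated form returns exactly the rows that its positive counterpart leaves out, which holds here because the sample data contains no NULL values.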
1. INSERT
Examples:
INSERT statement with direct values
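A minimal sketch of INSERT with direct values, written with Python's sqlite3 module. The table layout and the values are assumptions for illustration. The second statement names its target columns, so the unlisted column is left NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, first_name VARCHAR(30), department_id INTEGER)"
)

# Values given for every column, in table order.
conn.execute("INSERT INTO employee VALUES (1090, 'Asha', 1)")
# Values given only for the listed columns.
conn.execute("INSERT INTO employee (employee_id, first_name) VALUES (1091, 'Ben')")

rows = conn.execute("SELECT * FROM employee ORDER BY employee_id").fetchall()
print(rows)
```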
2. UPDATE
Updating allows us to change some values in a tuple without necessarily changing all of them. The WHERE
clause of an UPDATE statement may contain any construct legal in the WHERE clause of a SELECT statement
(including nesting). A nested SELECT within an UPDATE may reference the relation that is being updated.
As before, all tuples in the relation are first tested to see whether they should be updated, and the
updates are carried out afterwards.
Example: Update the date of birth of employee 1090 to 15-Oct-1979.
UPDATE EMPLOYEE
SET DATE_OF_BIRTH = '15-OCT-1979'
WHERE EMPLOYEE_ID = 1090
Note: The WHERE clause can take different forms as listed in SELECT… above, including SUB QUERY.
3. DELETE
Delete specific rows
DELETE FROM <table name> [WHERE predicate]
Delete all rows
DELETE FROM <table name>
Example: Delete the record of the employee with employee id 1091.
DELETE FROM EMPLOYEE WHERE EMPLOYEE_ID = 1091
Note: The WHERE clause can take different forms as listed in SELECT… above, including SUB QUERY.
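The UPDATE and DELETE examples can be run end to end with a short sketch in Python's sqlite3 module. Dates are stored as plain text here, which is a simplification of Oracle's DATE type, and the three sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, date_of_birth TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1090, "01-JAN-1980"), (1091, "02-FEB-1981"), (1092, "03-MAR-1982")],
)

# rowcount reports how many rows each statement changed.
updated = conn.execute(
    "UPDATE employee SET date_of_birth = '15-OCT-1979' WHERE employee_id = 1090"
).rowcount
deleted = conn.execute("DELETE FROM employee WHERE employee_id = 1091").rowcount
remaining = conn.execute(
    "SELECT employee_id, date_of_birth FROM employee ORDER BY employee_id"
).fetchall()

# DELETE without a WHERE clause removes every row but keeps the table itself.
conn.execute("DELETE FROM employee")
rows_left = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```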
4. SELECT
ORDER BY Clause
This clause retrieves rows in sorted order.
Example: SELECT a FROM <t> WHERE a LIKE 's%' ORDER BY a;
GROUP BY Clause
This clause groups rows based on the distinct values that exist for the specified columns.
HAVING Clause
This clause is used in conjunction with the GROUP BY clause. It filters the groups created by the GROUP BY clause.
Example: Write a SQL statement to get the list of departments having average salary less than the average
salary of the company. Include Department ID and Average Salary of Dept. in the list. Exclude
departments with less than 3 employees and order the list in the descending order of the department
average salary.
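One possible answer to this exercise is sketched below with Python's sqlite3 module. The notes do not show the EMPLOYEE layout for this query, so the SALARY column and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER)"
)
salaries = {1: [1000, 2000, 3000], 2: [5000, 6000, 7000], 3: [500, 700], 4: [3000, 3000, 3300]}
rows = [(dept, pay) for dept, pays in salaries.items() for pay in pays]
conn.executemany("INSERT INTO employee (department_id, salary) VALUES (?, ?)", rows)

query = """
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employee
    GROUP BY department_id
    HAVING COUNT(*) >= 3                                    -- drop small departments
       AND AVG(salary) < (SELECT AVG(salary) FROM employee) -- below company average
    ORDER BY avg_salary DESC
"""
result = conn.execute(query).fetchall()
print(result)
```

Department 3 has a low average but only two employees, so the COUNT(*) condition removes it. Department 2 is above the company average and drops out as well.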
Group/Aggregate functions
SUM() gives the total of the given numeric column over all rows satisfying any conditions.
AVG() gives the average of the given column.
MAX() gives the largest value in the given column.
MIN() gives the smallest value in the given column.
COUNT(column) gives the number of rows satisfying the conditions in which the given column is not NULL.
COUNT(*) gives the number of rows satisfying the conditions, or all rows in the table if there are none.
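Note: The functions above can be used in the SELECT list and in the HAVING clause. An aggregate function
cannot appear directly in a WHERE clause; to compare against an aggregate there, place it in a subquery.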
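A brief sketch of the aggregate functions using Python's sqlite3 module. The salary figures are invented, and one salary is deliberately NULL to show how COUNT(column) differs from COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, 1000), (2, 2000), (3, None), (4, 3000)],
)

# Aggregates skip NULL values; only COUNT(*) counts every row.
summary = conn.execute(
    "SELECT SUM(salary), AVG(salary), MAX(salary), MIN(salary), COUNT(salary), COUNT(*) "
    "FROM employee"
).fetchone()

# An aggregate reaches a WHERE clause only through a subquery.
above_average = conn.execute(
    "SELECT employee_id FROM employee WHERE salary > (SELECT AVG(salary) FROM employee)"
).fetchall()
print(summary, above_average)
```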
UNION
The UNION command combines the results of two SELECT statements. Both statements must select the same
number of columns, and corresponding columns need to be of compatible data types. With UNION, only
distinct rows are returned.
SQL Statement 1
UNION
SQL Statement 2
Statement to get list of all female employees or employees in department 1 without any duplicates
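The statement for this example is not shown in the notes. A sketch of what it could look like follows, run through Python's sqlite3 module on four invented rows, with UNION ALL added for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, first_name VARCHAR(30), sex CHAR(1), department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "F", 1), (2, "Ben", "M", 1), (3, "Cara", "F", 2), (4, "Dev", "M", 2)],
)

females = "SELECT employee_id, first_name FROM employee WHERE sex = 'F'"
dept_one = "SELECT employee_id, first_name FROM employee WHERE department_id = 1"

# UNION removes the duplicate: Asha is female and also works in department 1.
union_rows = conn.execute(females + " UNION " + dept_one + " ORDER BY employee_id").fetchall()
# UNION ALL keeps duplicates, so Asha appears twice.
union_all_count = len(conn.execute(females + " UNION ALL " + dept_one).fetchall())
print(union_rows, union_all_count)
```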
INTERSECTION
The INTERSECT command is used to find the intersection of two tables.
SQL Statement 1
INTERSECT
SQL Statement 2
INTERSECT ALL
The INTERSECT ALL command behaves like the INTERSECT command, except that
INTERSECT ALL retains duplicate rows instead of eliminating them.
MINUS
The MINUS command returns rows from first select statement minus the rows from second select
statement.
Statement to get list of female employees excluding those who belong to department 1
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME
FROM EMPLOYEE WHERE SEX = 'F'
MINUS
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME
FROM EMPLOYEE WHERE DEPARTMENT_ID = 1
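INTERSECT and MINUS can be compared on the same data with a sketch in Python's sqlite3 module. SQLite spells Oracle's MINUS as EXCEPT, so EXCEPT is what appears in the code. The four sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, first_name VARCHAR(30), sex CHAR(1), department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "F", 1), (2, "Ben", "M", 1), (3, "Cara", "F", 2), (4, "Dev", "M", 2)],
)

females = "SELECT employee_id, first_name FROM employee WHERE sex = 'F'"
dept_one = "SELECT employee_id, first_name FROM employee WHERE department_id = 1"

# Rows returned by both statements: female employees in department 1.
in_both = conn.execute(females + " INTERSECT " + dept_one).fetchall()
# Rows of the first statement that the second does not return (MINUS in Oracle).
only_females = conn.execute(females + " EXCEPT " + dept_one).fetchall()
print(in_both, only_females)
```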
2.4.3 Views
The DML statements given above help the users to manipulate the data stored in the ‘conceptual’ relations
in the database. Such conceptual relations are sometimes referred to as base relations. For security and
other concerns, sometimes it may be necessary to prevent direct access to the base relations by users. It is
possible to create ‘virtual relations’ from base relations for different groups of users, exposing only the
relevant data elements to different groups of users. This is possible in SQL by creating views. A relation in
a view is virtual since no corresponding physical relation exists. A view represents a different perspective
of a base relation or relations. The general syntax of create view statement is:
Create view <view name> as <query expression>
The tuples of a view are not stored, since the data in the base relations may change. However, the definition
of a view in a create view statement is stored in the system catalog. Whenever a query refers to a virtual
relation defined by a view, the view is recomputed to refresh the tuples in the virtual relation.
Example: Pay_Rate of an employee is confidential in nature and hence not all users are permitted to see
the Pay_Rate of an employee. So, DBA can create a view EMP_VIEW to prevent viewing Pay_Rate by
normal users and give access to this view rather than giving direct access to the EMPLOYEE table.
EMP_VIEW can be defined as:
Create view EMP_VIEW as (select Emp_Id, Name, Skill from EMPLOYEE)
A view can be deleted from the database by using drop view statement. Syntax of drop view statement is:
Drop view <view name>
The tuples in a view can be updated only under certain conditions. When a tuple in a view does not map to
a single tuple in the base relation, the update operation may not be possible. Views that involve a join may
not be updatable if they do not include the primary keys of the base relations.
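The EMP_VIEW example can be sketched with Python's sqlite3 module. Two caveats apply: the sample rows are invented, and SQLite treats every view as read-only, which is stricter than the update rules described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name VARCHAR(30), skill VARCHAR(20), pay_rate NUMERIC)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'Welding', 42.5)")

# The view exposes every column except the confidential pay_rate.
conn.execute("CREATE VIEW emp_view AS SELECT emp_id, name, skill FROM employee")
cursor = conn.execute("SELECT * FROM emp_view")
view_columns = [d[0] for d in cursor.description]
cursor.fetchall()

# The view is recomputed on each query, so new base rows show up at once.
conn.execute("INSERT INTO employee VALUES (2, 'Ben', 'Painting', 38.0)")
view_rows = conn.execute("SELECT * FROM emp_view ORDER BY emp_id").fetchall()

try:
    conn.execute("UPDATE emp_view SET skill = 'Fitting' WHERE emp_id = 1")
    view_updatable = True
except sqlite3.OperationalError:
    view_updatable = False  # SQLite refuses to modify a view directly

conn.execute("DROP VIEW emp_view")
```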