DBMS Module-3-Notes - SQL
Module – III
SQL
Overview
SEQUEL (Structured English QUEry Language) was developed at the IBM San Jose Research
Laboratory as part of the System R project. It was later renamed SQL (Structured Query
Language).
• ANSI and ISO standard SQL:
• SQL-86 or SQL1
• SQL-89
• SQL-92 or SQL2
• SQL:1999 (language name became Y2K compliant) or SQL3
• SQL:2003
SQL is a comprehensive database language: it has statements for data definition, query and
update. It also has facilities for defining views on the database, for specifying security and
authorization, for defining integrity constraints and for specifying transaction controls.
Commercial systems offer most, if not all, SQL2 features, plus varying feature sets from later
standards and special proprietary features.
The concept of a relational database schema was incorporated into SQL2 in order to group
together tables and other constructs that belong to the same database application. A schema is
identified by a schema name and includes an authorization identifier, which indicates the user
or account that owns the schema. It also includes the descriptors for each element in the
schema.
Schema elements include tables, constraints, views, domains and other constructs (such as
authorization grants) that describe the schema. Elements of a schema can be defined when the
schema is created or later.
• CREATE SCHEMA
• Specifies a new database schema by giving it a name. Example:
• create schema company authorization KUMAR;
• Not all users are authorized to create schemas and schema elements
• The privilege to create schemas, tables and other constructs must be explicitly granted to
the relevant users by the DBA.
• SQL2 uses the concept of catalog.
• Catalog is a named collection of schemas in an SQL environment.
• A catalog contains a special schema called INFORMATION_SCHEMA, which provides
authorized users with information on all the schemas in the catalog and on all the element
descriptors of those schemas.
• Integrity constraints can be defined on relations and between relations of the same
schema.
• Schemas within the same catalog can also share certain elements such as domain
definitions.
• The CREATE TABLE command is used to specify a new base relation by giving its name, and
specifying each of its attributes and constraints.
• The attributes are specified first; each is given a name, a data type and any attribute
constraints such as NOT NULL or CHECK.
• The key, entity integrity and referential integrity constraints can be specified within the
CREATE TABLE command after the attributes declaration, or can be added later using the
ALTER command.
• An SQL relation is defined using the create table command:
CREATE TABLE R(A1 D1, A2 D2, ..., An Dn,
(integrity-constraint_1),
...,
(integrity-constraint_k))
• R is the name of the relation
• Each Ai is an attribute name in the schema of relation R
• Di is the data type of values in the domain of attribute Ai
UNIQUE (DNAME),
FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE (SSN));
• The schema name can be explicitly attached to the relation name, separated by a period.
• CREATE TABLE COMPANY.EMPLOYEE ...
• The EMPLOYEE table then becomes part of the schema COMPANY
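The effect of the key, CHECK and referential integrity constraints described above can be tried out with a small script. This is only a sketch: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in for a full SQL2 system, and the reduced EMPLOYEE and DEPARTMENT column lists and the helper rejected() are illustrative, not part of the notes.

```python
import sqlite3

# Assumption: SQLite stands in for a full SQL2 system; it has no CREATE SCHEMA.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE EMPLOYEE (
    SSN    CHAR(9)       NOT NULL,
    LNAME  VARCHAR(15)   NOT NULL,
    SALARY DECIMAL(10,2) CHECK (SALARY > 0),
    PRIMARY KEY (SSN)
);
CREATE TABLE DEPARTMENT (
    DNUMBER INT         NOT NULL,
    DNAME   VARCHAR(15) NOT NULL,
    MGRSSN  CHAR(9),
    PRIMARY KEY (DNUMBER),
    UNIQUE (DNAME),
    FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE (SSN)
);
""")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith', 30000)")
conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Research', '123456789')")

def rejected(sql):
    """Hypothetical helper: True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Each statement below breaks exactly one of the declared constraints.
duplicate_name = rejected("INSERT INTO DEPARTMENT VALUES (6, 'Research', NULL)")       # UNIQUE
unknown_manager = rejected("INSERT INTO DEPARTMENT VALUES (7, 'Sales', '000000000')")  # FOREIGN KEY
bad_salary = rejected("INSERT INTO EMPLOYEE VALUES ('987654321', 'Wong', -1)")         # CHECK
print(duplicate_name, unknown_manager, bad_salary)  # True True True
```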
• Numeric: integer numbers of various sizes and floating-point numbers of various precision
o INTEGER or INT: Integer (a finite subset of the integers that is machine-dependent)
without any decimal part.
o SMALLINT: Small integer (a machine-dependent subset of the integer domain type)
without any decimal part.
o REAL, DOUBLE precision: Floating point and double-precision floating point
numbers, with machine-dependent precision.
o FLOAT (n): Floating point number, with user-specified precision of at least n digits.
o NUMERIC(i, j) or DECIMAL(i, j) or DEC(i, j): Fixed point number, with user-specified
precision of i total digits, of which j digits are to the right of the decimal point.
• Character: a sequence of characters of either fixed or varying length; the default length
of a character string is 1
o CHAR(n) or CHARACTER(n): Fixed length character string, with user-specified length
n.
o VARCHAR(n) or CHAR VARYING(n) or CHARACTER VARYING(n): Variable length
character strings, with user-specified maximum length n.
o For fixed length strings (CHARACTER), a shorter string is padded with blank spaces to the
right; the padded blanks are generally ignored when strings are compared (lexicographic order)
o CHARACTER LARGE OBJECT (CLOB) for large text values (documents)
• Bit-string: a sequence of bits of either fixed or varying length; the default length of a
bit string is 1. BLOB (BINARY LARGE OBJECT) is also available to specify columns that have
large binary values, such as images.
• Date: has ten positions in the form YYYY-MM-DD; its components are YEAR, MONTH and DAY
• Time: has at least eight positions in the form HH:MM:SS; its components are HOUR,
MINUTE and SECOND
o Only valid dates and times are allowed, and comparison operators can be used.
o TIME(i): Made up of hour:minute:second plus i additional digits specifying fractions
of a second; the format is hh:mm:ss:ii...i
• TIMESTAMP: A timestamp includes the DATE and TIME fields, plus a minimum of six
positions for decimal fractions of seconds and an optional WITH TIME ZONE qualifier.
o TIME WITH TIME ZONE: this data type includes an additional six positions for
specifying the displacement from the standard universal time zone, which is in the
range +13:00 to -12:59 in units of HOURS:MINUTES
o If WITH TIME ZONE is not included, the default is the local time zone for the SQL
session.
• INTERVAL: Interval data type specifies an interval – a relative value that can be used to
increment or decrement an absolute value of a date, time or timestamp.
o Intervals are qualified to be either YEAR/MONTH or DAY/TIME intervals.
o An interval can be positive or negative; when it is added to or subtracted from an
absolute value, the result is an absolute value.
Domains in SQL
• A domain can be declared once and then used as the data type of several attributes.
• CREATE DOMAIN ENO_TYPE AS CHAR(9);
• ENO_TYPE can be used in place of CHAR(9) with SSN, ESSN, MGRSSN.
• If the data type of the domain is changed, the change is reflected in all the attributes
that use it; domains also improve schema readability.
• A DEFAULT clause declares a default value for an attribute.
o Whenever an explicit value is not provided for the attribute, the default value is
stored in the new tuple.
• A CHECK clause is used to restrict the values allowed for an attribute or a domain.
• We can specify CASCADE, SET NULL or SET DEFAULT on referential integrity constraints
(foreign keys).
• An option must be qualified with either ON DELETE or ON UPDATE.
• Possible options for referential triggered actions:
o ON DELETE SET DEFAULT
o ON DELETE SET NULL
o ON DELETE CASCADE
o ON UPDATE CASCADE
o ON UPDATE SET DEFAULT
o ON UPDATE SET NULL
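The referential triggered actions listed above can be observed directly. The sketch below assumes SQLite through Python's sqlite3 module as the engine; the two-column tables and the sample rows are illustrative assumptions, and the combination ON DELETE SET NULL with ON UPDATE CASCADE is just one possible choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite for foreign key enforcement
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INT PRIMARY KEY, DNAME VARCHAR(15));
CREATE TABLE EMPLOYEE (
    SSN CHAR(9) PRIMARY KEY,
    DNO INT,
    FOREIGN KEY (DNO) REFERENCES DEPARTMENT (DNUMBER)
        ON DELETE SET NULL ON UPDATE CASCADE
);
INSERT INTO DEPARTMENT VALUES (4, 'Administration'), (5, 'Research');
INSERT INTO EMPLOYEE VALUES ('111', 4), ('222', 5);
""")

# ON UPDATE CASCADE: the new department number is propagated to EMPLOYEE.DNO.
conn.execute("UPDATE DEPARTMENT SET DNUMBER = 50 WHERE DNUMBER = 5")
# ON DELETE SET NULL: employees of the deleted department keep their tuple, with DNO = NULL.
conn.execute("DELETE FROM DEPARTMENT WHERE DNUMBER = 4")

rows = conn.execute("SELECT SSN, DNO FROM EMPLOYEE ORDER BY SSN").fetchall()
print(rows)  # [('111', None), ('222', 50)]
```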
• A constraint name is used to identify a particular constraint in case the constraint must be
dropped or modified later.
• Giving names to constraints is optional.
• The name of each constraint must be unique within its schema.
Drop commands
• DROP command is used to remove an element & its definition
• DROP SCHEMA: To drop schema
• DROP TABLE: To drop table
• The relation (or schema) can no longer be used in queries, updates, or any other commands
since its description no longer exists
• There are two DROP behavior options CASCADE and RESTRICT
• The CASCADE option removes the schema together with all its tables, views and other elements
Examples:
• DROP SCHEMA COMPANY CASCADE;
o The schema and all its elements are dropped.
• DROP TABLE EMPLOYEE CASCADE;
o The table is dropped together with all constraints and views that reference it.
• If the RESTRICT option is used instead of CASCADE
o A schema is dropped only if it has no elements; otherwise an error is raised.
o A table is dropped only if it is not referenced in any constraint of another table or in any view.
ALTER command
• The definition of a base table can be changed by using ALTER TABLE command.
• The various possible options include
o adding a column
o dropping a column
o changing the definition of column
o adding and dropping the table constraints
• The ALTER TABLE command is used to add an attribute to one of the base relations
• If no DEFAULT clause is specified, the new attribute will have NULLs in all the tuples of the
relation right after the command is executed; hence, the NOT NULL constraint is not allowed for
such an attribute
Example:
• ALTER TABLE EMPLOYEE ADD JOB CHAR(12);
• A value for the new attribute JOB must still be entered for each EMPLOYEE tuple.
• This can be done by specifying a DEFAULT clause when the attribute is added or by using the
UPDATE command afterwards.
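The behaviour of a newly added attribute can be checked with a short script. This is a sketch that assumes SQLite through Python's sqlite3 module; the two sample tuples and the job title are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, LNAME VARCHAR(15));
INSERT INTO EMPLOYEE VALUES ('111', 'Smith'), ('222', 'Wong');
ALTER TABLE EMPLOYEE ADD JOB CHAR(12);
""")
# Right after ALTER TABLE, every existing tuple has NULL in the new column.
before = conn.execute("SELECT SSN, JOB FROM EMPLOYEE ORDER BY SSN").fetchall()

# Values are then supplied tuple by tuple with UPDATE.
conn.execute("UPDATE EMPLOYEE SET JOB = 'Engineer' WHERE SSN = '111'")
after = conn.execute("SELECT SSN, JOB FROM EMPLOYEE ORDER BY SSN").fetchall()

print(before)  # [('111', None), ('222', None)]
print(after)   # [('111', 'Engineer'), ('222', None)]
```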
• The ALTER TABLE command is also used to drop an attribute from one of the base relations.
• To drop a column, either CASCADE or RESTRICT must be chosen as the drop behaviour.
• ALTER TABLE EMPLOYEE DROP JOB CASCADE;
• If CASCADE is chosen, all constraints and views that reference the column are dropped
automatically from the schema along with the column.
• If RESTRICT is chosen, the command succeeds only if no views or constraints reference the
column.
• The ALTER TABLE command is also used to modify an attribute of one of the base relations.
• ALTER TABLE DEPARTMENT ALTER MGRSSN DROP DEFAULT;
• ALTER TABLE DEPARTMENT ALTER MGRSSN SET DEFAULT 999;
Query 0: Retrieve the birthdate and address of the employee whose name is ‘John B. Smith’.
SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John' AND MINIT='B'
AND LNAME='Smith';
The equivalent relational algebra expression is:
Π BDATE, ADDRESS (σ FNAME='John' AND MINIT='B' AND LNAME='Smith' (EMPLOYEE))
Query 1: Retrieve the name and address of all employees who work for the ‘Research’
department.
Query 2: For every project located in ‘Stafford’, list the project number, the controlling
department number, and the department manager’s last name, address, and birthdate.
Query 1A: Retrieve the name and address of all employees who work for the ‘Research’
department.
SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND
EMPLOYEE.DNUMBER = DEPARTMENT.DNUMBER
Query 8: For each employee, retrieve the employee’s name, and the name of his or her
immediate supervisor.
SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN=S.SSN
• The alternate relation names E and S are called aliases.
• E and S can be thought of as two different copies of EMPLOYEE; E represents employees in the
role of supervisees and S represents employees in the role of supervisors
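The role of the two aliases can be seen on a three-tuple EMPLOYEE table. The sketch assumes SQLite through Python's sqlite3 module, and the sample names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, FNAME VARCHAR(15),
                       LNAME VARCHAR(15), SUPERSSN CHAR(9));
INSERT INTO EMPLOYEE VALUES
    ('111', 'James',    'Borg',  NULL),
    ('222', 'Franklin', 'Wong',  '111'),
    ('333', 'John',     'Smith', '222');
""")
# E ranges over supervisees, S over supervisors; both are the same stored table.
rows = conn.execute("""
    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN
    ORDER BY E.SSN
""").fetchall()
print(rows)
# [('Franklin', 'Wong', 'James', 'Borg'), ('John', 'Smith', 'Franklin', 'Wong')]
```

Borg does not appear as a supervisee because his SUPERSSN is NULL, so the join condition is never satisfied for that tuple.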
Unspecified WHERE-clause
• A missing WHERE-clause indicates no condition; hence, all tuples of the relations in the
FROM-clause are selected
• If more than one relation is specified in the FROM-clause and there is no join condition,
then the CARTESIAN PRODUCT of tuples is selected
Q10
SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT
• It is extremely important not to overlook specifying any selection and join conditions in the
WHERE-clause; otherwise, incorrect and very large relations may result
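How quickly a missing join condition inflates the result can be checked by counting rows. The sketch assumes SQLite through Python's sqlite3 module; the tiny sample tables are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9), DNO INT);
CREATE TABLE DEPARTMENT (DNUMBER INT, DNAME VARCHAR(15));
INSERT INTO EMPLOYEE VALUES ('111', 5), ('222', 5), ('333', 4);
INSERT INTO DEPARTMENT VALUES (4, 'Administration'), (5, 'Research');
""")
# No WHERE-clause: every EMPLOYEE tuple is paired with every DEPARTMENT tuple.
product = conn.execute("SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT").fetchall()
# With the join condition, each employee is paired only with his or her own department.
joined = conn.execute(
    "SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT WHERE DNO = DNUMBER"
).fetchall()
print(len(product), len(joined))  # 6 3
```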
Use of * (Asterisk)
• To retrieve all the attribute values of the selected tuples, a * is used, which stands for all the
attributes
• SQL usually treats a table not as a set but rather as a multiset: duplicate tuples can appear
more than once in a table and in the result of a query.
• SQL does not automatically eliminate duplicate tuples in the result of queries because:
o Duplicate elimination is expensive.
o The user may want to see the duplicates in the result of query.
o When an aggregate function is applied to tuples, eliminating duplicates is usually not desired.
• An SQL table with a Primary key is restricted to being a set, since the key value must be
distinct in each tuple.
• Keyword DISTINCT can be used in SELECT clause if duplicates have to be eliminated in the
query result.
SELECT SALARY FROM EMPLOYEE
or SELECT ALL SALARY FROM EMPLOYEE (duplicates are retained)
SELECT DISTINCT SALARY FROM EMPLOYEE (duplicates are eliminated)
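The difference between the default multiset result and a DISTINCT result can be shown on three salaries. The sketch assumes SQLite through Python's sqlite3 module, with made-up salary values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9), SALARY INT);
INSERT INTO EMPLOYEE VALUES ('111', 30000), ('222', 30000), ('333', 40000);
""")
# ALL is the default: the duplicate salary 30000 appears twice.
all_salaries = [r[0] for r in conn.execute(
    "SELECT ALL SALARY FROM EMPLOYEE ORDER BY SALARY")]
# DISTINCT turns the multiset into a set.
distinct_salaries = [r[0] for r in conn.execute(
    "SELECT DISTINCT SALARY FROM EMPLOYEE ORDER BY SALARY")]
print(all_salaries)       # [30000, 30000, 40000]
print(distinct_salaries)  # [30000, 40000]
```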
• SQL has directly incorporated some set operations: union (UNION), set difference (MINUS or
EXCEPT) and intersection (INTERSECT)
• The resulting relations of these set operations are sets of tuples; duplicate tuples are
eliminated from the result
• The set operations apply only to union compatible relations; the two relations must have
the same attributes and the attributes must appear in the same order
• If duplicates have to be retained
o union operation (UNION ALL)
o set difference (MINUS ALL or EXCEPT ALL)
o intersection (INTERSECT ALL)
• (The ALL variants are not supported by every system.)
Query 4: Make a list of all project numbers for projects that involve an employee whose last
name is ‘Smith’ as a worker or as a manager of the department that controls the project.
(SELECT DISTINCT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
UNION
(SELECT DISTINCT PNO
FROM WORKS_ON, EMPLOYEE
WHERE ESSN=SSN AND LNAME='Smith')
Make a list of all project numbers for projects that involve an employee whose last name is
‘Smith’ as a worker and as a manager of the department that controls the project.
(SELECT DISTINCT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
INTERSECT
(SELECT DISTINCT PNO
FROM WORKS_ON, EMPLOYEE
WHERE ESSN=SSN AND LNAME='Smith')
Make a list of all project numbers for projects that involve an employee whose last name is
‘Smith’ as a manager of the department that controls the project but not as a worker.
(SELECT DISTINCT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
MINUS
(SELECT DISTINCT PNO
FROM WORKS_ON, EMPLOYEE
WHERE ESSN=SSN AND LNAME='Smith')
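The three set operations can be compared side by side on a small instance. This sketch assumes SQLite through Python's sqlite3 module, which spells set difference EXCEPT rather than MINUS and does not accept parentheses around the operand queries; the reduced tables, the sample rows and the helper run() are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9), LNAME VARCHAR(15));
CREATE TABLE DEPARTMENT (DNUMBER INT, MGRSSN CHAR(9));
CREATE TABLE PROJECT (PNUMBER INT, DNUM INT);
CREATE TABLE WORKS_ON (ESSN CHAR(9), PNO INT);
INSERT INTO EMPLOYEE VALUES ('111', 'Smith'), ('222', 'Wong');
INSERT INTO DEPARTMENT VALUES (5, '111'), (4, '222');
INSERT INTO PROJECT VALUES (1, 5), (2, 5), (3, 4);
INSERT INTO WORKS_ON VALUES ('111', 2), ('111', 3), ('222', 1);
""")
# Projects controlled by a department that Smith manages: 1 and 2.
manager_q = """SELECT PNUMBER FROM PROJECT, DEPARTMENT, EMPLOYEE
               WHERE DNUM = DNUMBER AND MGRSSN = SSN AND LNAME = 'Smith'"""
# Projects on which Smith works: 2 and 3.
worker_q = """SELECT PNO FROM WORKS_ON, EMPLOYEE
              WHERE ESSN = SSN AND LNAME = 'Smith'"""

def run(op):
    """Hypothetical helper: combine the two queries with one set operator."""
    sql = f"{manager_q} {op} {worker_q} ORDER BY 1"
    return [r[0] for r in conn.execute(sql)]

as_either = run("UNION")        # manager or worker
as_both = run("INTERSECT")      # manager and worker
manager_only = run("EXCEPT")    # manager but not worker
print(as_either, as_both, manager_only)  # [1, 2, 3] [2] [1]
```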
Query 12: Retrieve all employees whose address is in Houston, Texas. (Here, the value of
the ADDRESS attribute must contain the substring ‘Houston,Texas’.)
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE ADDRESS LIKE '%Houston,Texas%'
Query 12A: Retrieve all employees who were born during the 1950s. (Here BDATE is assumed to
be stored as a string that ends with the four-digit year, so the year part must match ‘195_’;
each underscore is a placeholder for a single arbitrary character, and % matches any number of
characters.)
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE BDATE LIKE '%195_';
(The leading % can be replaced by one underscore for each character that precedes the year.)
Arithmetic Operators
• The standard arithmetic operators ‘+’, ‘-’, ‘*’, ‘/’ (for addition, subtraction, multiplication,
and division, respectively) can be applied to numeric values in an SQL query result.
Query 13: Show the resulting salaries if every employee working on the ‘ProductX’ project is
given a 10% raise.
SELECT FNAME, LNAME, 1.1*SALARY AS INCREASED_SALARY
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE SSN=ESSN AND PNO=PNUMBER AND PNAME='ProductX'
Comparison Operators
Query 14: Retrieve all the employees in department 5 whose salary is between Rs. 30,000 and
Rs. 40,000.
SELECT *
FROM EMPLOYEE
WHERE (SALARY BETWEEN 30000 AND 40000) AND DNO = 5;
or equivalently:
SELECT *
FROM EMPLOYEE
WHERE (SALARY >= 30000) AND (SALARY <= 40000) AND DNO = 5;
• The ORDER BY clause is used to sort the tuples in a query result based on the values of
some attribute(s)
• The default order is in ascending order of values.
• The keyword DESC can be used if descending order is required; the keyword ASC can be
used to explicitly specify ascending order, even though it is the default
Query 15: Retrieve a list of employees and the projects each works in, ordered by the
employee’s department, and within each department ordered alphabetically by employee
last name, first name.
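A simplified ordering example is sketched below. It assumes SQLite through Python's sqlite3 module and sorts on EMPLOYEE alone (department number descending, then last name and first name ascending) rather than joining in the department and project tables that Query 15 needs; the sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (FNAME VARCHAR(15), LNAME VARCHAR(15), DNO INT);
INSERT INTO EMPLOYEE VALUES
    ('John', 'Smith', 5), ('Franklin', 'Wong', 5),
    ('Alicia', 'Zelaya', 4), ('Ahmad', 'Jabbar', 4);
""")
# DESC applies only to DNO; LNAME and FNAME use the default ascending order.
rows = conn.execute("""
    SELECT DNO, LNAME, FNAME
    FROM EMPLOYEE
    ORDER BY DNO DESC, LNAME ASC, FNAME ASC
""").fetchall()
print(rows)
# [(5, 'Smith', 'John'), (5, 'Wong', 'Franklin'), (4, 'Jabbar', 'Ahmad'), (4, 'Zelaya', 'Alicia')]
```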
NULLS in SQL
• SQL allows queries that check if a value is NULL (missing or undefined or not applicable)
• SQL uses IS or IS NOT to compare NULLs because it considers each NULL value distinct from
other NULL values, so equality comparison is not appropriate.
Query 14: Retrieve the names of all employees who do not have supervisors.
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE SUPERSSN IS NULL
• Note: If a join condition is specified, tuples with NULL values for the join attributes are not
included in the result
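The reason IS NULL is needed instead of an equality test can be checked on two tuples. The sketch assumes SQLite through Python's sqlite3 module; the sample data is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (LNAME VARCHAR(15), SUPERSSN CHAR(9));
INSERT INTO EMPLOYEE VALUES ('Borg', NULL), ('Wong', '111');
""")

def names(condition):
    """Hypothetical helper: last names of employees satisfying the condition."""
    sql = f"SELECT LNAME FROM EMPLOYEE WHERE {condition} ORDER BY LNAME"
    return [r[0] for r in conn.execute(sql)]

with_is_null = names("SUPERSSN IS NULL")       # finds the employee without a supervisor
with_equals = names("SUPERSSN = NULL")         # never true: a comparison with NULL is unknown
with_is_not_null = names("SUPERSSN IS NOT NULL")
print(with_is_null, with_equals, with_is_not_null)  # ['Borg'] [] ['Wong']
```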
Renaming Attributes
• Any attribute which appears in the result can be renamed by adding the qualifier AS
followed by the desired new name.
• AS construct can be used for both attribute names and relation names and can be used in
both SELECT and FROM clauses.
SELECT ENAME AS EMPNAME
FROM EMP AS EMPLOYEE
Nested Queries
• Some queries require that existing values in database be fetched and then used in a
comparison condition.
• A complete SELECT query, called a nested query, can be specified within the WHERE-clause
of another query, called the outer query
Query 1: Retrieve the name and address of all employees who work for the ‘Research’
department.
SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE
WHERE DNO IN
(SELECT DNUMBER
FROM DEPARTMENT
WHERE DNAME='Research')
• The nested query selects the number of the ‘Research’ department
• The outer query selects an EMPLOYEE tuple if its DNO value is in the result of the nested query
• The comparison operator IN compares a value v with a set (or multi-set) of values V, and
evaluates to TRUE if v is one of the elements in V
• In general, we can have several levels of nested queries
Query 4A: Make a list of all project numbers for projects that involve an employee whose last
name is ‘Smith’ as a worker or as a manager of the department that controls the project.
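SELECT DISTINCT PNUMBER FROM PROJECT WHERE PNUMBER IN
(SELECT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
OR PNUMBER IN
(SELECT PNO
FROM WORKS_ON, EMPLOYEE
WHERE ESSN=SSN AND LNAME='Smith')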
Query 16: Retrieve the name of each employee who has a dependent with the same name
and same gender as the employee.
SELECT E.ENAME
FROM EMPLOYEE AS E
WHERE E.SSN IN
(SELECT D.ESSN
FROM DEPENDENT AS D
WHERE E.GENDER = D.GENDER AND
E.ENAME = D.DEP_NAME)
Single Block Query: A query written with nested SELECT... FROM... WHERE... blocks and using
the = or IN comparison operators can always be expressed as a single block query.
Q16A
SELECT E.ENAME
FROM EMPLOYEE E, DEPENDENT D
WHERE E.SSN = D.ESSN AND
E.ENAME = D.DEP_NAME AND
E.GENDER = D.GENDER;
Query 16B: Retrieve the name of each employee who has a dependent with the same name
and same gender as the employee.
SELECT E.ENAME FROM EMPLOYEE E
WHERE EXISTS
(SELECT *
FROM DEPENDENT D
WHERE E.SSN=D.ESSN AND
E.ENAME=D.DEP_NAME AND
E.GENDER = D.GENDER)
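• For each EMPLOYEE tuple, the correlated nested query retrieves the related DEPENDENT tuples
that have the same name and gender. If at least one exists, the EMPLOYEE tuple is selected;
with NOT EXISTS, it would be selected only if none exist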
• EXISTS is necessary for the expressive power of SQL
Q7. List the names of managers who have at least one dependent.
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE EXISTS
(SELECT *
FROM DEPENDENT
WHERE SSN=ESSN)
AND EXISTS
(SELECT *
FROM DEPARTMENT
WHERE SSN=MGRSSN);
• The original SQL as specified for SYSTEM R also had a CONTAINS comparison operator,
which is used in conjunction with nested correlated queries
• This operator was dropped from the language, possibly because of the difficulty in
implementing it efficiently
• Most implementations of SQL do not have operator CONTAINS
• The CONTAINS operator compares two sets of values, and returns TRUE if one set contains
all values in the other set (reminiscent of the division operation of algebra).
Query 3: Retrieve the name of each employee who works on all the projects controlled by
department number 5.
SELECT FNAME, LNAME FROM EMPLOYEE
WHERE ((SELECT PNO
FROM WORKS_ON
WHERE SSN=ESSN)
CONTAINS
(SELECT PNUMBER
FROM PROJECT
WHERE DNUM=5))
• The second nested query, which is not correlated with the outer query, retrieves the project
numbers of all projects controlled by department 5
• The first nested query, which is correlated, retrieves the project numbers on which the
employee works, which is different for each employee tuple because of the correlation
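Since CONTAINS is not available in most systems, the same "works on all projects" condition is commonly rephrased with two nested NOT EXISTS: select an employee if there is no project controlled by department 5 on which the employee does not work. The sketch below assumes SQLite through Python's sqlite3 module; the reduced tables and sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9), LNAME VARCHAR(15));
CREATE TABLE PROJECT (PNUMBER INT, DNUM INT);
CREATE TABLE WORKS_ON (ESSN CHAR(9), PNO INT);
INSERT INTO EMPLOYEE VALUES ('111', 'Smith'), ('222', 'Wong');
INSERT INTO PROJECT VALUES (1, 5), (2, 5), (3, 4);
INSERT INTO WORKS_ON VALUES ('111', 1), ('111', 2), ('222', 1), ('222', 3);
""")
# Department 5 controls projects 1 and 2. Smith works on both, Wong only on project 1.
rows = conn.execute("""
    SELECT LNAME
    FROM EMPLOYEE
    WHERE NOT EXISTS
        (SELECT *
         FROM PROJECT
         WHERE DNUM = 5 AND NOT EXISTS
             (SELECT *
              FROM WORKS_ON
              WHERE ESSN = SSN AND PNO = PNUMBER))
""").fetchall()
print(rows)  # [('Smith',)]
```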
Query 13: Retrieve the social security numbers of all employees who work on project number
1, 2, or 3.
SELECT DISTINCT ESSN
FROM WORKS_ON
WHERE PNO IN (1,2,3)
Renaming Attributes
Query 8A: For each employee, retrieve the employee’s name, and the name of his or her
immediate supervisor.
SELECT E.LNAME AS Supervisee_name,
S.LNAME AS Supervisor_name
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.SUPERSSN=S.SSN
Joining Tables
• We can specify a “joined relation” in the FROM-clause and it looks like any other
relation but is the result of a join
• It allows the user to specify different types of joins (regular THETA JOIN, NATURAL JOIN,
LEFT OUTER JOIN, RIGHT OUTER JOIN, CROSS JOIN, etc)
Query 1: Retrieve the name and address of all employees who work for the ‘Research’
department.
Q1: SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNUMBER=DNO AND DNAME='Research';
could be written as:
Q1: SELECT FNAME, LNAME, ADDRESS
FROM (EMPLOYEE JOIN DEPARTMENT ON DNUMBER=DNO)
WHERE DNAME='Research';
Query 2: For every project located in ‘Stafford’, list the project number, the controlling
department number, and the department manager’s last name, address, and birthdate.
could be written as follows; this illustrates multiple joins in the joined tables
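A joined-table formulation with more than one join, together with a LEFT OUTER JOIN, is sketched below. It assumes SQLite through Python's sqlite3 module; the column name PLOCATION for the project location, the reduced attribute lists and the sample rows are assumptions made for illustration and do not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9), LNAME VARCHAR(15), DNO INT);
CREATE TABLE DEPARTMENT (DNUMBER INT, DNAME VARCHAR(15), MGRSSN CHAR(9));
CREATE TABLE PROJECT (PNUMBER INT, PLOCATION VARCHAR(15), DNUM INT);
INSERT INTO EMPLOYEE VALUES ('111', 'Wallace', 4), ('222', 'Wong', 5), ('333', 'Borg', NULL);
INSERT INTO DEPARTMENT VALUES (4, 'Administration', '111'), (5, 'Research', '222');
INSERT INTO PROJECT VALUES (10, 'Stafford', 4), (30, 'Stafford', 4), (1, 'Bellaire', 5);
""")
# Two joins in one FROM-clause: project -> controlling department -> its manager.
stafford = conn.execute("""
    SELECT PNUMBER, DNUM, LNAME
    FROM PROJECT JOIN DEPARTMENT ON DNUM = DNUMBER
                 JOIN EMPLOYEE ON MGRSSN = SSN
    WHERE PLOCATION = 'Stafford'
    ORDER BY PNUMBER
""").fetchall()
# LEFT OUTER JOIN keeps employees without a department, padding DNAME with NULL.
with_dept = conn.execute("""
    SELECT LNAME, DNAME
    FROM EMPLOYEE LEFT OUTER JOIN DEPARTMENT ON DNO = DNUMBER
    ORDER BY LNAME
""").fetchall()
print(stafford)   # [(10, 4, 'Wallace'), (30, 4, 'Wallace')]
print(with_dept)  # [('Borg', None), ('Wallace', 'Administration'), ('Wong', 'Research')]
```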
Aggregate Functions
Query 19: Find the sum of the salary of all employees, the maximum salary, the minimum
salary, and the average salary among employees.
SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY) FROM EMPLOYEE;
Query 20: Find the sum of the salary of all employees, the maximum salary, the minimum
salary, and the average salary among employees who work for the ‘Research’ department.
SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER AND
DNAME='Research';
Query 22: Retrieve the total number of employees in the Research Department.
SELECT COUNT(*)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER AND DNAME='Research';
Query 5: List the names of all employees with two or more dependents.
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE (SELECT COUNT (*) FROM DEPENDENT
WHERE SSN = ESSN) >=2;
Grouping
• SQL has a GROUP BY-clause for specifying the grouping attributes, which should also appear
in the SELECT-clause so that each result row can be matched to its group
• In many cases, we want to apply the aggregate functions to subgroups of tuples in a relation
• Each subgroup of tuples consists of the set of tuples that have the same value for the
grouping attribute(s)
• Then aggregate function is applied to each subgroup independently
Query 20: For each department, retrieve the department number, the number of employees
in the department, and their average salary.
SELECT DNO, COUNT (*), AVG (SALARY)
FROM EMPLOYEE
GROUP BY DNO;
• The EMPLOYEE tuples are divided into groups--each group having the same value for the
grouping attribute DNO
• The COUNT and AVG functions are applied to each such group of tuples separately
• The SELECT-clause includes only the grouping attribute and the functions to be applied on
each group of tuples
• A join condition can be used in conjunction with grouping
Query 25: For each project, retrieve the project number, project name, and the number of
employees who work on that project.
SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME;
• In this case, the grouping and functions are applied after the joining of the two relations
The Having-clause
• Sometimes we want to retrieve the values of these functions for only those groups that
satisfy certain conditions
• The HAVING-clause is used for specifying a selection condition on groups (rather than on
individual tuples)
Query 26: For each project on which more than two employees work, retrieve the project
number, project name, and the number of employees who work on that project.
SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME
HAVING COUNT (*) >2;
Query 27: For each project, retrieve the project number, project name, and the number of
employees from department 5 who work on that project.
SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE PNUMBER=PNO AND SSN=ESSN AND DNO=5
GROUP BY PNUMBER, PNAME;
Query 28: For each department that has more than five employees, retrieve the department
number, and the number of its employees who are earning more than 40000.
A query is evaluated by first applying the WHERE-clause, then GROUP BY and HAVING, and
finally the SELECT-clause.
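This evaluation order matters for queries such as Query 28: a salary condition in the WHERE-clause removes tuples before the groups are counted, so a HAVING COUNT(*) test would no longer see the full department size. The sketch below assumes SQLite through Python's sqlite3 module and scales the threshold down to "more than two employees" so that a five-tuple table is enough; the data is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9), DNO INT, SALARY INT);
INSERT INTO EMPLOYEE VALUES
    ('111', 5, 30000), ('222', 5, 45000), ('333', 5, 50000),
    ('444', 4, 43000), ('555', 4, 55000);
""")
# Misleading: WHERE runs first, so HAVING counts only the high earners (2 per department).
wrong = conn.execute("""
    SELECT DNO, COUNT(*) FROM EMPLOYEE
    WHERE SALARY > 40000
    GROUP BY DNO
    HAVING COUNT(*) > 2
""").fetchall()
# Intended: a nested query picks departments by their full size, then high earners are counted.
right = conn.execute("""
    SELECT DNO, COUNT(*) FROM EMPLOYEE
    WHERE SALARY > 40000 AND DNO IN
        (SELECT DNO FROM EMPLOYEE GROUP BY DNO HAVING COUNT(*) > 2)
    GROUP BY DNO
""").fetchall()
print(wrong)  # []
print(right)  # [(5, 2)]
```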
Question Bank
1. What are the different attribute data types and domains in SQL? Explain.
2. Explain the following commands in SQL, with examples: DROP, CREATE, ALTER, UPDATE,
ROLLBACK, CHECK, EXISTS/NOT EXISTS
3. Explain the various constraints in SQL with syntax and examples.
4. Discuss the INSERT, DELETE and UPDATE statements in SQL with examples.
5. Explain JOIN operations in SQL with syntax and examples.
6. Explain Nested Queries with examples.
7. Explain the usage of GROUP BY and HAVING clause with syntax and examples.
8. List and explain all the aggregate functions in SQL with syntax and suitable examples.
9. Questions on Formulation of SQL queries, given any database.