Advanced Database Design and Implementation: Lesson 04 SQL
Dr Eric Umuhoza
eric.umuhoza@gmail.com
Database Systems (Atzeni, Ceri, Paraboschi, Torlone)
Chapter 4: SQL
SQL
• The name is an acronym for Structured Query Language
• Far richer than a query language: both a DML and a DDL
• History:
– First proposal: SEQUEL (IBM Research, 1974)
– First implementation in SQL/DS (IBM, 1981)
• Standardization crucial for its diffusion
– Since 1983, de facto standard
– First standard, 1986, revised in 1989 (SQL-89)
– Second standard, 1992 (SQL-2 or SQL-92)
– Third standard, 1999 (SQL-3 or SQL-99)
• Most relational systems support the base functionality of the
standard and offer proprietary extensions
McGraw-Hill and Atzeni, Ceri, Paraboschi, Torlone 1999
Domains
• Domains specify the content of attributes
• Two categories
– Elementary (predefined by the standard)
– User-defined
Elementary domains, 1
• Character
– Single characters or strings
– Strings may be of variable length
– A character set different from the default one can be used
(e.g., Latin, Greek, Cyrillic, etc.)
– Syntax:
character [ varying ] [ (Length) ]
[ character set CharSetName ]
– It is possible to use char and varchar, respectively for
character and character varying
Elementary domains, 2
• Bit
– Single boolean values or strings of boolean values (may be
variable in length)
– Syntax:
bit [ varying ] [ (Length) ]
• Exact numeric domains
– Exact values, integer or with a fractional part
– Four alternatives:
numeric [ ( Precision [, Scale ] ) ]
decimal [ ( Precision [, Scale ] ) ]
integer
smallint
Elementary domains, 3
• Approximate numeric domains
– Approximate real values
– Based on a floating point representation
float [ ( Precision ) ]
double precision
real
Elementary domains, 4
• Temporal instants
date
time [ ( Precision ) ] [ with time zone ]
timestamp [ ( Precision ) ] [ with time zone ]
• Temporal intervals
interval FirstUnitOfTime [ to LastUnitOfTime ]
– Units of time are divided into two groups:
• year, month
• day, hour, minute, second
Schema definition
• A schema is a collection of objects:
– domains, tables, indexes, assertions, views, privileges
• A schema has a name and an owner (the authorization)
• Syntax:
create schema [ SchemaName ]
[ [ authorization ] Authorization ]
{ SchemaElementDefinition }
Table definition
• An SQL table consists of
– an ordered set of attributes
– a (possibly empty) set of constraints
• Statement create table
– defines a relation schema, creating an empty instance
• Syntax:
create table TableName
(
AttributeName Domain [ DefaultValue ] [ Constraints ]
{, AttributeName Domain [ DefaultValue ] [ Constraints ] }
[ OtherConstraints ]
)
• Syntax of DefaultValue:
default < GenericValue | user | null >
Intra-relational constraints
• Constraints are conditions that must be verified by every
database instance
• Intra-relational constraints involve a single relation
– not null (on single attributes)
– unique: permits the definition of keys; syntax:
• for single attributes:
unique, after the domain
• for multiple attributes:
unique( Attribute {, Attribute } )
– primary key: defines the primary key (once for each table;
implies not null); syntax like unique
– check: described later
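A minimal sketch of the table definition and intra-relational constraints above, run through Python's built-in sqlite3 module as a stand-in DBMS. The attribute list, the RegNo key and the 'Unassigned' default are illustrative assumptions, and SQLite's dialect differs from the standard in places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table Employee (
        RegNo     char(6) primary key,
        FirstName varchar(20) not null,
        Surname   varchar(20) not null,
        Dept      varchar(15) default 'Unassigned',
        unique (Surname, FirstName)
    )
""")
conn.execute(
    "insert into Employee (RegNo, FirstName, Surname) "
    "values ('M2047', 'Mary', 'Brown')"
)

def rejected(statement):
    """Return True if the DBMS refuses the statement for violating a constraint."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False

# Same RegNo as an existing row: violates the primary key.
duplicate_key = rejected(
    "insert into Employee (RegNo, FirstName, Surname) "
    "values ('M2047', 'Gus', 'Green')"
)
# Same (Surname, FirstName) pair: violates the multi-attribute unique constraint.
duplicate_name = rejected(
    "insert into Employee (RegNo, FirstName, Surname) "
    "values ('M2048', 'Mary', 'Brown')"
)
# Surname omitted and no default defined: violates not null.
missing_surname = rejected(
    "insert into Employee (RegNo, FirstName) values ('M2049', 'Gus')"
)
# Dept was omitted in the first insert, so it received the default value.
default_dept = conn.execute(
    "select Dept from Employee where RegNo = 'M2047'"
).fetchone()[0]
```

Each violating statement is refused as a whole, so the table still contains only the first row.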
Inter-relational constraints
• Constraints may take into account several relations
– check: described later
– references and foreign key permit the definition of
referential integrity constraints; syntax:
• for single attributes
references after the domain
• for multiple attributes
foreign key ( Attribute {, Attribute } )
references …
– It is possible to associate reaction policies to violations of
referential integrity
Reaction policies
for referential integrity constraints
• Reactions operate on the internal (referencing) table, after
changes to the external (referenced) table
• Violations may be introduced (1) by updates on the referred
attribute or (2) by row deletions
• Reactions:
– cascade: propagate the change
– set null: nullify the referring attribute
– set default: assign the default value to the referring
attribute
– no action: forbid the change on the external table
• Reactions may depend on the event; syntax:
on < delete | update >
< cascade | set null | set default | no action >
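The reaction policies can be tried out with a small sketch, again using Python's sqlite3 module as an assumed stand-in DBMS. The two tables and their rows are illustrative; note that SQLite only enforces foreign keys after `pragma foreign_keys = on`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific: enable enforcement
conn.executescript("""
    create table Department (
        DeptName varchar(20) primary key,
        City     varchar(20)
    );
    create table Employee (
        RegNo char(6) primary key,
        Dept  varchar(20) references Department(DeptName)
              on delete set null
              on update cascade
    );
    insert into Department values ('Production', 'Toulouse'), ('Planning', 'London');
    insert into Employee values ('E1', 'Production'), ('E2', 'Planning');
""")

# Update on the referred attribute: cascade propagates the new name to Employee.
conn.execute(
    "update Department set DeptName = 'Manufacturing' "
    "where DeptName = 'Production'"
)
# Row deletion in the external table: set null nullifies the referring attribute.
conn.execute("delete from Department where DeptName = 'Planning'")

employees = conn.execute(
    "select RegNo, Dept from Employee order by RegNo"
).fetchall()
```

After both changes, E1 refers to the renamed department and E2 has a null Dept.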
Schema updates
• Two SQL statements:
– alter (alter domain ..., alter table …)
– drop
drop < schema | domain | table | view | assertion >
ComponentName [ restrict | cascade ]
• Examples:
– alter table Department
add column NoOfOffices numeric(4)
– drop table TempTable cascade
Relational catalogues
• The catalog contains the data dictionary, the description of the
data contained in the database
• It is based on a relational structure (reflexive)
• The SQL-2 standard describes a Definition_Schema (composed
of tables) and an Information_Schema (composed of views)
SQL queries
• SQL queries are expressed by the select statement
• Syntax:
select AttrExpr [[ as ] Alias ] {, AttrExpr [[ as ] Alias ] }
from Table [[ as ] Alias ] {, Table [[ as ] Alias ] }
[ where Condition ]
• The three parts of the query are usually called:
– target list
– from clause
– where clause
• The query considers the cartesian product of the tables in the
from clause, considers only the rows that satisfy the condition
in the where clause and for each row evaluates the attribute
expressions in the target list
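The evaluation order described above (cartesian product, then the where condition, then the target list) can be sketched in plain Python. This is only a model of the semantics, not how a DBMS executes queries; the `select` helper and the sample rows are hypothetical, and the sketch assumes attribute names are distinct across tables.

```python
from itertools import product

employee = [
    {"FirstName": "Mary", "Surname": "Brown", "Dept": "Administration"},
    {"FirstName": "Charles", "Surname": "White", "Dept": "Production"},
]
department = [
    {"DeptName": "Administration", "City": "London"},
    {"DeptName": "Production", "City": "Toulouse"},
]

def select(target_list, tables, condition):
    """Model of select-from-where over lists of dict rows."""
    result = []
    for combination in product(*tables):      # cartesian product of the from clause
        row = {}
        for part in combination:
            row.update(part)                  # one combined row per combination
        if condition(row):                    # where clause
            result.append(tuple(expr(row) for expr in target_list))  # target list
    return result

# select FirstName, Surname, City
# from Employee, Department
# where Dept = DeptName
result = select(
    target_list=[
        lambda r: r["FirstName"],
        lambda r: r["Surname"],
        lambda r: r["City"],
    ],
    tables=[employee, department],
    condition=lambda r: r["Dept"] == r["DeptName"],
)
```

Of the four combinations in the product, only the two whose Dept equals DeptName survive the condition.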
Example database
EMPLOYEE FirstName Surname Dept Office Salary City
Mary Brown Administration 10 45 London
Charles White Production 20 36 Toulouse
Gus Green Administration 20 40 Oxford
Jackson Neri Distribution 16 45 Dover
Charles Brown Planning 14 80 London
Laurence Chen Planning 7 73 Worthing
Pauline Bradshaw Administration 75 40 Brighton
Alice Jackson Production 20 46 Toulouse
• Result:
Remuneration
45
80
Attribute expressions
• Find the monthly salary of the employees named White:
select Salary / 12 as MonthlySalary
from Employee
where Surname = 'White'
• Result:
MonthlySalary
3.00
Table aliases
• Find the names of the employees and the cities in which they
work (using an alias):
select FirstName, Surname, D.City
from Employee, Department D
where Dept = DeptName
Predicate conjunction
• Find the first names and surnames of the employees who work
in office number 20 of the Administration department:
select FirstName, Surname
from Employee
where Office = '20' and
Dept = 'Administration'
• Result:
FirstName Surname
Gus Green
Predicate disjunction
• Find the first names and surnames of the employees who work
in either the Administration or the Production department:
select FirstName, Surname
from Employee
where Dept = 'Administration' or
Dept = 'Production'
• Result:
FirstName Surname
Mary Brown
Charles White
Gus Green
Pauline Bradshaw
Alice Jackson
Operator like
• Find the employees with surnames that have ‘r’ as the second
letter and end in ‘n’:
select *
from Employee
where Surname like '_r%n'
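• Result:
FirstName Surname Dept Office Salary City
Mary Brown Administration 10 45 London
Gus Green Administration 20 40 Oxford
Charles Brown Planning 14 80 London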
Duplicates
• In relational algebra and calculus the results of queries do not
contain duplicates
• In SQL, tables may have identical rows
• Duplicates can be removed using the keyword distinct
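A short sketch of the effect of distinct, using Python's sqlite3 module as an assumed stand-in DBMS and a cut-down Employee table with three rows taken from the example database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (FirstName text, Surname text, City text)")
conn.executemany(
    "insert into Employee values (?, ?, ?)",
    [
        ("Mary", "Brown", "London"),
        ("Charles", "White", "Toulouse"),
        ("Charles", "Brown", "London"),
    ],
)

# Without distinct, the result keeps one row per matching employee.
with_duplicates = conn.execute(
    "select City from Employee where Surname = 'Brown'"
).fetchall()

# With distinct, identical rows of the result are collapsed into one.
without_duplicates = conn.execute(
    "select distinct City from Employee where Surname = 'Brown'"
).fetchall()
```

The first query returns London twice, the second only once.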
Joins in SQL-2
• SQL-2 introduced an alternative syntax for the representation of
joins, representing them explicitly in the from clause:
select AttrExpr [[ as ] Alias ] {, AttrExpr [[ as ] Alias ] }
from Table [[ as ] Alias ]
{ [ JoinType] join Table [[ as ] Alias ] on JoinConditions }
[ where OtherCondition ]
Left join
• Find the drivers with their cars, including the drivers without
cars:
select FirstName, Surname, Driver.DriverID,
CarRegNo, Make, Model
from Driver left join Automobile on
(Driver.DriverID = Automobile.DriverID)
• Result:
FirstName Surname DriverID CarRegNo Make Model
Mary Brown VR 2030020Y ABC 123 BMW 323
Mary Brown VR 2030020Y DEF 456 BMW Z3
Charles White PZ 1012436B GHI 789 Lancia Delta
Marco Neri AP 4544442R NULL NULL NULL
Full join
• Find all the drivers and all the cars, showing the possible
relationships between them:
select FirstName, Surname, Driver.DriverID,
CarRegNo, Make, Model
from Driver full join Automobile on
(Driver.DriverID = Automobile.DriverID)
• Result:
FirstName Surname DriverID CarRegNo Make Model
Mary Brown VR 2030020Y ABC 123 BMW 323
Mary Brown VR 2030020Y DEF 456 BMW Z3
Charles White PZ 1012436B GHI 789 Lancia Delta
Marco Neri AP 4544442R NULL NULL NULL
NULL NULL NULL BBB 421 BMW 316
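The two join results above can be reproduced with a sketch in Python's sqlite3 module (an assumed stand-in DBMS). The table contents are read off the result tables shown here. Because older SQLite releases do not accept full join, the sketch emulates it by adding the unmatched cars to the left join result; that emulation is a workaround, not the syntax of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Driver (FirstName text, Surname text, DriverID text);
    create table Automobile (CarRegNo text, Make text, Model text, DriverID text);
    insert into Driver values
        ('Mary', 'Brown', 'VR 2030020Y'),
        ('Charles', 'White', 'PZ 1012436B'),
        ('Marco', 'Neri', 'AP 4544442R');
    insert into Automobile values
        ('ABC 123', 'BMW', '323', 'VR 2030020Y'),
        ('DEF 456', 'BMW', 'Z3', 'VR 2030020Y'),
        ('GHI 789', 'Lancia', 'Delta', 'PZ 1012436B'),
        ('BBB 421', 'BMW', '316', 'MI 2020030U');
""")

# Left join: every driver appears, with nulls when no car matches.
left_rows = conn.execute("""
    select FirstName, Surname, Driver.DriverID, CarRegNo, Make, Model
    from Driver left join Automobile on
         (Driver.DriverID = Automobile.DriverID)
""").fetchall()

# Cars without a matching driver: the rows a full join adds to the left join.
unmatched_cars = conn.execute("""
    select null, null, null, CarRegNo, Make, Model
    from Automobile left join Driver on
         (Driver.DriverID = Automobile.DriverID)
    where Driver.DriverID is null
""").fetchall()

full_rows = left_rows + unmatched_cars
```

The left join yields four rows and the emulated full join five, matching the tables above.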
Table variables
• Table aliases may be interpreted as table variables
• They correspond to the renaming operator ρ of relational
algebra
• Find all the employees who have the same surname (but a different
first name) as an employee belonging to the Administration department:
select E1.FirstName, E1.Surname
from Employee E1, Employee E2
where E1.Surname = E2.Surname and
E1.FirstName <> E2.FirstName and
E2.Dept = 'Administration'
• Result:
FirstName Surname
Charles Brown
Ordering
• The order by clause, at the end of the query, orders the rows
of the result; syntax:
order by OrderingAttribute [ asc | desc ]
{, OrderingAttribute [ asc | desc ] }
• Extract the content of the AUTOMOBILE table in descending order
of make and model:
select *
from Automobile
order by Make desc, Model desc
• Result:
CarRegNo Make Model DriverID
GHI 789 Lancia Delta PZ 1012436B
DEF 456 BMW Z3 VR 2030020Y
ABC 123 BMW 323 VR 2030020Y
BBB 421 BMW 316 MI 2020030U
Aggregate queries
• Aggregate queries cannot be represented in relational algebra
• The result of an aggregate query depends on the consideration
of sets of rows
• SQL-2 offers five aggregate operators:
– count
– sum
– max
– min
– avg
Operator count
• count returns the number of rows or distinct values; syntax:
count(< * | [ distinct | all ] AttributeList >)
• Find the number of employees:
select count(*)
from Employee
• Find the number of different values on the attribute Salary for all
the rows in EMPLOYEE:
select count(distinct Salary)
from Employee
• Find the number of rows of EMPLOYEE having a not null value on
the attribute Salary:
select count(all Salary)
from Employee
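A sketch of the three forms of count, using Python's sqlite3 module as an assumed stand-in DBMS. The salaries are those of the example database; one extra row with a null salary is a hypothetical addition, included only so that the third query differs from the first. Since all is the default, the sketch writes count(Salary) for count(all Salary).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (Surname text, Salary integer)")
conn.executemany(
    "insert into Employee values (?, ?)",
    [
        ("Brown", 45), ("White", 36), ("Green", 40), ("Neri", 45),
        ("Brown", 80), ("Chen", 73), ("Bradshaw", 40), ("Jackson", 46),
        ("Trainee", None),  # hypothetical row with a null salary
    ],
)

# Number of rows, nulls included.
row_count = conn.execute("select count(*) from Employee").fetchone()[0]

# Number of different non-null salary values.
distinct_salaries = conn.execute(
    "select count(distinct Salary) from Employee"
).fetchone()[0]

# Number of rows with a non-null salary (all is the default, so it is omitted).
non_null_salaries = conn.execute(
    "select count(Salary) from Employee"
).fetchone()[0]
```

Here count(*) counts the null row, while both attribute-based forms ignore it.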
• Result:
SumSalary
125
• Result:
MaxLondonSal
80
Group by queries
• Queries may apply aggregate operators to subsets of rows
• Find the sum of salaries of all the employees of the same
department:
select Dept, sum(Salary) as TotSal
from Employee
group by Dept
Dept Salary
Administration 45
Production 36
Administration 40
Distribution 45
Planning 80
Planning 73
Administration 40
Production 46
Group predicates
• When conditions are on the result of an aggregate operator, it is
necessary to use the having clause
• Find which departments spend more than 100 on salaries:
select Dept
from Employee
group by Dept
having sum(Salary) > 100
• Result:
Dept
Administration
Planning
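The group by and having queries above can be checked against the example data with a sketch in Python's sqlite3 module (an assumed stand-in DBMS). Only the Dept and Salary columns of EMPLOYEE are loaded, and an order by is added so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (Dept text, Salary integer)")
conn.executemany(
    "insert into Employee values (?, ?)",
    [
        ("Administration", 45), ("Production", 36), ("Administration", 40),
        ("Distribution", 45), ("Planning", 80), ("Planning", 73),
        ("Administration", 40), ("Production", 46),
    ],
)

# One result row per department, with the aggregate computed inside each group.
totals = dict(conn.execute("""
    select Dept, sum(Salary) as TotSal
    from Employee
    group by Dept
""").fetchall())

# having filters whole groups, after the aggregate has been computed.
big_spenders = [row[0] for row in conn.execute("""
    select Dept
    from Employee
    group by Dept
    having sum(Salary) > 100
    order by Dept
""")]
```

A where clause, by contrast, would filter individual rows before the groups are formed.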
where or having?
select Dept
from Employee
where Office = '20'
group by Dept
having avg(Salary) > 25
select TargetList
from TableList
[ where Condition ]
[ group by GroupingAttributeList ]
[ having AggregateCondition ]
[ order by OrderingAttributeList ]
Set queries
• A single select cannot represent unions
• Syntax:
Intersection
• Find the surnames of employees that are also first names:
select FirstName as Name
from Employee
intersect
select Surname
from Employee
• equivalent to:
select E1.FirstName as Name
from Employee E1, Employee E2
where E1.FirstName = E2.Surname
Difference
• Find the first names of employees that are not also surnames:
select FirstName as Name
from Employee
except
select Surname
from Employee
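The intersection and difference queries can be run on the example database with a sketch in Python's sqlite3 module (an assumed stand-in DBMS). Only the name columns are loaded, and an order by on the result is added for a deterministic order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (FirstName text, Surname text)")
conn.executemany(
    "insert into Employee values (?, ?)",
    [
        ("Mary", "Brown"), ("Charles", "White"), ("Gus", "Green"),
        ("Jackson", "Neri"), ("Charles", "Brown"), ("Laurence", "Chen"),
        ("Pauline", "Bradshaw"), ("Alice", "Jackson"),
    ],
)

# Names that occur both as a first name and as a surname.
both = [row[0] for row in conn.execute("""
    select FirstName as Name from Employee
    intersect
    select Surname from Employee
    order by Name
""")]

# First names that never occur as a surname; duplicates are removed.
first_names_only = [row[0] for row in conn.execute("""
    select FirstName as Name from Employee
    except
    select Surname from Employee
    order by Name
""")]
```

Jackson is the only name in the intersection, and Charles appears once in the difference even though two employees share that first name.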
Nested queries
• The where clause may contain predicates that:
– compare an attribute (or attribute expression) with the result
of an SQL query; syntax:
ScalarValue Operator < any | all > SelectSQL
• any: the predicate is true if at least one row returned by
SelectSQL satisfies the comparison
• all: the predicate is true if all the rows returned by
SelectSQL satisfy the comparison
– use the existential quantifier on an SQL query; syntax:
exists SelectSQL
• the predicate is true if SelectSQL returns a non-empty
result
• The query appearing in the where clause is called a nested query
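A sketch of nested queries on the example data, using Python's sqlite3 module as an assumed stand-in DBMS. SQLite does not implement the any and all quantifiers, so the second query expresses "Salary >= all (...)" through an equivalent not exists; this rewriting is valid here because no salary is null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (FirstName text, Surname text, Salary integer)")
conn.executemany(
    "insert into Employee values (?, ?, ?)",
    [
        ("Mary", "Brown", 45), ("Charles", "White", 36), ("Gus", "Green", 40),
        ("Jackson", "Neri", 45), ("Charles", "Brown", 80), ("Laurence", "Chen", 73),
        ("Pauline", "Bradshaw", 40), ("Alice", "Jackson", 46),
    ],
)

# exists: true when the nested query returns at least one row for the outer row E.
share_surname = conn.execute("""
    select FirstName, Surname
    from Employee E
    where exists (select *
                  from Employee E1
                  where E1.Surname = E.Surname and
                        E1.FirstName <> E.FirstName)
    order by FirstName
""").fetchall()

# Same meaning as: where Salary >= all (select Salary from Employee)
highest_paid = conn.execute("""
    select FirstName, Surname
    from Employee E
    where not exists (select *
                      from Employee E1
                      where E1.Salary > E.Salary)
""").fetchall()
```

In both queries the nested query refers to the variable E of the outer query, so it is evaluated once for each outer row.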
Tuple constructor
• The comparison with the nested query may involve more than
one attribute
• The attributes must be enclosed within a pair of curved brackets
(tuple constructor)
• The previous query can be expressed in this way:
select *
from Person P
where (FirstName,Surname) not in
(select FirstName, Surname
from Person P1
where P1.TaxCode <> P.TaxCode)
Scope of variables
• Incorrect query:
select *
from Employee
where Dept in
(select DeptName
from Department D1
where DeptName = 'Production') or
Dept in (select DeptName
from Department D2
where D2.City = D1.City)
• The query is incorrect because variable D1 is not visible in the
second nested query
Insertions, 1
• Syntax:
insert into TableName [ (AttributeList) ]
< values (ListOfValues) | SelectSQL>
• Using values:
insert into Department(DeptName, City)
values('Production', 'Toulouse')
• Using a subquery:
insert into LondonProducts
(select Code, Description
from Product
where ProdArea = 'London')
Insertions, 2
• The ordering of the attributes (if present) and of values is
meaningful (first value with the first attribute, and so on)
• If AttributeList is omitted, all the relation attributes are
considered, in the order in which they appear in the table
definition
• If AttributeList does not contain all the relation attributes, the
remaining attributes are assigned the default value (if defined)
or the null value
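A sketch of both forms of insert, using Python's sqlite3 module as an assumed stand-in DBMS. The Address attribute, the 'London' default and the product rows are illustrative assumptions. SQLite expects the subquery without the enclosing parentheses shown in the syntax above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Department (
        DeptName varchar(20) primary key,
        Address  varchar(50),
        City     varchar(20) default 'London'
    );
    create table Product (Code char(4), Description varchar(30), ProdArea varchar(20));
    create table LondonProducts (Code char(4), Description varchar(30));
    insert into Product values ('P1', 'Bolt', 'London'), ('P2', 'Nut', 'Dover');
""")

# Only DeptName is listed: City takes its default, Address (no default) is null.
conn.execute("insert into Department (DeptName) values ('Research')")
research = conn.execute(
    "select DeptName, Address, City from Department"
).fetchone()

# Insertion of the rows returned by a query.
conn.execute("""
    insert into LondonProducts
    select Code, Description
    from Product
    where ProdArea = 'London'
""")
london_products = conn.execute("select * from LondonProducts").fetchall()
```
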
Deletions, 1
• Syntax:
delete from TableName [ where Condition ]
Deletions, 2
• The delete statement removes from the table all the tuples
that satisfy the condition
• The removal may produce deletions from other tables if a
referential integrity constraint with cascade policy has been
defined
• If the where clause is omitted, delete removes all the tuples
Updates, 1
• Syntax:
update TableName
set Attribute = < Expression | SelectSQL | null | default >
{, Attribute = < Expression | SelectSQL | null | default >}
[ where Condition ]
• Examples:
update Employee
set Salary = Salary + 5
where RegNo = 'M2047'
update Employee
set Salary = Salary * 1.1
where Dept = 'Administration'
Updates, 2
• Since the language is set oriented, the order of the statements is
important
update Employee
set Salary = Salary * 1.1
where Salary <= 30
update Employee
set Salary = Salary * 1.15
where Salary > 30
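Why the order matters can be seen by running the two updates in both orders. The sketch below uses Python's sqlite3 module as an assumed stand-in DBMS; the two employees (salaries 30 and 40) and the helper `final_salaries` are hypothetical.

```python
import sqlite3

def final_salaries(statements):
    """Run the updates in the given order on a fresh table; return RegNo -> Salary."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Employee (RegNo char(6) primary key, Salary numeric)")
    conn.executemany("insert into Employee values (?, ?)", [("E1", 30), ("E2", 40)])
    for statement in statements:
        conn.execute(statement)
    rows = conn.execute(
        "select RegNo, round(Salary, 2) from Employee order by RegNo"
    ).fetchall()
    conn.close()
    return dict(rows)

low_raise = "update Employee set Salary = Salary * 1.1 where Salary <= 30"
high_raise = "update Employee set Salary = Salary * 1.15 where Salary > 30"

# Order of the slide: E1 is raised to about 33, now exceeds 30, and is raised again.
low_first = final_salaries([low_raise, high_raise])

# Reverse order: each employee receives exactly one raise.
high_first = final_salaries([high_raise, low_raise])
```

With the order of the slide, the employee earning 30 ends at about 37.95 instead of 33, because the first update moves the row into the set selected by the second.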
Assertions
• Assertions permit the definition of constraints outside of table
definitions
• Useful in many situations (e.g., to express generic
inter-relational constraints)
• An assertion associates a name to a check clause; syntax:
create assertion AssertionName check (Condition)
Views, 1
• Syntax:
create view ViewName [ (AttributeList) ] as SelectSQL
[ with [ local | cascaded ] check option ]
Views, 2
• SQL views cannot be mutually dependent (no recursion)
• The check option operates when a view content is updated
• Views can be used to formulate complex queries
– Views decompose the problem and produce a more
readable solution
• Views are sometimes necessary to express certain queries:
– queries that combine and nest several aggregate operators
– queries that make a sophisticated use of the union operator
select Dept
from SalaryBudget
where SalaryTotal = (select max(SalaryTotal)
from SalaryBudget)
select avg(NoOfOffices)
from DeptOffice
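A sketch of how a view lets the first query above combine two aggregate operators (max over sum), using Python's sqlite3 module as an assumed stand-in DBMS. The definition of SalaryBudget is an assumption made for this example: one row per department with the sum of its salaries, computed from the example database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (Dept text, Salary integer)")
conn.executemany(
    "insert into Employee values (?, ?)",
    [
        ("Administration", 45), ("Production", 36), ("Administration", 40),
        ("Distribution", 45), ("Planning", 80), ("Planning", 73),
        ("Administration", 40), ("Production", 46),
    ],
)

# Assumed view definition: total salary spent by each department.
conn.execute("""
    create view SalaryBudget as
    select Dept, sum(Salary) as SalaryTotal
    from Employee
    group by Dept
""")

# The department (or departments) with the highest total salary.
top_departments = conn.execute("""
    select Dept
    from SalaryBudget
    where SalaryTotal = (select max(SalaryTotal)
                         from SalaryBudget)
""").fetchall()
```

The view computes the inner aggregate, so the query on top of it only needs the outer one.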
Access control
• Every component of the schema can be protected (tables,
attributes, views, domains, etc.)
• The owner of a resource (the creator) assigns privileges to the
other users
• A predefined user _system represents the database
administrator and has complete access to all the resources
• A privilege is characterized by:
– the resource
– the user who grants the privilege
– the user who receives the privilege
– the action that is allowed on the resource
– whether or not the privilege can be passed on to other users
Types of privilege
• SQL offers six types of privilege
– insert: to insert a new object into the resource
– update: to modify the resource content
– delete: to remove an object from the resource
– select: to access the resource content in a query
– references: to build a referential integrity constraint with
the resource (may limit the ability to modify the resource)
– usage: to use the resource in a schema definition (e.g., a
domain)