Structured Query Language
Thomas Heinis
t.heinis@imperial.ac.uk
wp.doc.ic.ac.uk/theinis
SQL
SQL (Structured Query Language) is the most prominent relational
database language, used in more than 99% of database applications.
SQL supports schema creation and modification; data insertion, retrieval,
update and deletion; constraints, indexing of attributes, transactions, data
access control (authorisation), plus lots more.
Most SQL implementations also provide one or more procedural languages
for writing procedures (stored procedures) that execute SQL statements
within an RDBMS. Such procedures can be called directly by the user, by
client programs, by other stored procedures, or called automatically by the
database when certain events happen (triggers), for example after a tuple
is updated.
We'll do a tour of SQL (well, a subset of SQL).
SQL
Relational algebra      SQL
R ∪ S                   R union S
R ∩ S                   R intersect S
R − S                   R except S
π_attributes(R)         select attributes from R
σ_condition(R)          select * from R where condition
R × S                   R, S or R cross join S
R ⋈ S                   R natural join S
R ⋈_condition S         R join S on condition
Vocabulary
Relational Algebra       SQL       Comment
Relation                 Table
Relational Expression    Views
Tuple                    Row
Attribute                Column
Domain                   Type
SQL Gotchas
Standards
Duplicates
Nulls
Attributes do not need to have a value; they can be null. Nulls can be
used to indicate a missing value, a value that is not known, a value
that is private, etc. Nulls are best avoided.
Booleans
Types
Most SQL implementations support a wide range of types including:
int, smallint, real, double precision, float(n), numeric(p,d), decimal(p,d)
Most DBs support a variety of integer and floating point types.
Ranges are implementation dependent. The usual arithmetic
operators are available.
char, char(n), varchar(n), clob/text, ...
Strings can be fixed length (padded with spaces), varying length (up to
n), or unlimited length (clob/text). The concatenation operator is ||.
string like pattern performs pattern matching, where pattern can
include _ for any single character and % for zero or more characters, e.g.
X like 'B%' matches any string starting with B. There is also similar to
for regular expression matches.
Bits, bytes, and binary large objects (blobs). Often used for audio,
images, movies, files, etc.
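A minimal sketch of the concatenation operator and like patterns, run against an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumed stand-in for whichever RDBMS is in use, and the studio rows are made-up sample data. SQLite's like is case-insensitive for ASCII letters by default, which other systems may not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("create table studio(name varchar(30), address varchar(60))")
conn.executemany(
    "insert into studio values (?, ?)",
    [("Fox", "Century City"), ("Pixar", "Emeryville"), ("BBC", "London")],  # hypothetical rows
)

# || concatenates strings; 'B%' matches names starting with B
labels = [row[0] for row in conn.execute(
    "select name || ' (' || address || ')' from studio "
    "where name like 'B%' order by name"
)]

# '___' matches names that are exactly three characters long
three_letter = [row[0] for row in conn.execute(
    "select name from studio where name like '___' order by name"
)]

print(labels)
print(three_letter)
```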
Types
A few more types:
boolean
...
Each RDBMS has a long list of additional types for currency, xml,
geo-spatial data, CAD data, multi-media, etc. Some also support
user-defined types.
x         y         x and y   x or y    not x
TRUE      TRUE      TRUE      TRUE      FALSE
TRUE      unknown   unknown   TRUE      FALSE
TRUE      FALSE     FALSE     TRUE      FALSE
unknown   TRUE      unknown   TRUE      unknown
unknown   unknown   unknown   unknown   unknown
unknown   FALSE     FALSE     unknown   unknown
FALSE     TRUE      FALSE     TRUE      TRUE
FALSE     unknown   FALSE     unknown   TRUE
FALSE     FALSE     FALSE     FALSE     TRUE
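A few rows of the three-valued truth table can be checked directly. This is a sketch that assumes SQLite (via Python's sqlite3 module), where TRUE is stored as 1, FALSE as 0, and unknown as null (None in Python). That encoding is specific to SQLite, not part of SQL in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each column evaluates one cell of the truth table with x = unknown (null)
row = conn.execute(
    "select (null and 0), "   # unknown and FALSE
    "       (null and 1), "   # unknown and TRUE
    "       (null or 1), "    # unknown or TRUE
    "       (null or 0), "    # unknown or FALSE
    "       (not null)"       # not unknown
).fetchone()

print(row)
```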
Nulls
SQL attributes can have the special value null. There are many interpretations for null,
including:
Missing
There is a value, but we're not entitled to record the value, e.g.
an unlisted phone number.
Nulls
We need to understand the implications of nulls on arithmetic and
comparisons, including for joins (see later).
Arithmetic
Comparisons
Any comparison involving a null will result in unknown, e.g. x>y where y is
null will result in unknown.
null is not a constant value like true and can't be used in comparisons.
To test if an attribute y is null, use y is null or y is not null.
null will never match any other value (even null itself), unless we explicitly
use is null or is not null.
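The effect of nulls on comparisons can be seen in a small sketch, again assuming SQLite through Python's sqlite3 module and a hypothetical person table with invented rows. The last line also illustrates the standard rule that arithmetic on a null yields null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table person(name text, phone text)")
conn.executemany(
    "insert into person values (?, ?)",
    [("Ann", "555-0100"), ("Bob", None)],  # Bob's phone is null
)

# phone = null evaluates to unknown for every row, so nothing is returned
eq_null = [r[0] for r in conn.execute("select name from person where phone = null")]

# is null / is not null are the explicit tests for nulls
is_null = [r[0] for r in conn.execute("select name from person where phone is null")]
not_null = [r[0] for r in conn.execute("select name from person where phone is not null")]

# arithmetic involving null produces null (None in Python)
null_sum = conn.execute("select 1 + null").fetchone()[0]

print(eq_null, is_null, not_null, null_sum)
```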
Queries
Probably the most used and most complex statement in
SQL is the select statement, which is used to query
(retrieve) data from a database.
select supports all the relational operators as well as
sorting, grouping and aggregate functions. The relation
produced by a select is normally returned to the user or
client program, but can be used as a subquery in
expressions.
Example:
movie(title, year, length, genre, studio, producer)
To return all attributes, use * for the projected attributes.
Renaming AIributes
We can rename attributes (and use expressions) in the projection part
of a select with the as keyword. Renaming is useful if we have clashing
attribute names that represent different things, or we want to carry
out set operations on relations with differing attribute names. as can
also be used to rename relations, as we'll see.
Example:
For readability or to disambiguate attributes we can prefix the relation name,
e.g.:
select movie.title as name, movie.length/60 as hours
from movie
where movie.studio='fox' and movie.year>1990
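The renaming query can be tried on a few rows. This is a sketch assuming SQLite via Python's sqlite3 module; the movie rows are invented. It divides by 60.0 rather than 60 because SQLite performs integer division when both operands are integers, which is an adjustment for this sketch and not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table movie(title text, year int, length int, "
    "genre text, studio text, producer text)"
)
conn.executemany(
    "insert into movie values (?, ?, ?, ?, ?, ?)",
    [
        ("Alpha", 1995, 120, "sf", "fox", "P1"),     # hypothetical rows
        ("Beta", 1985, 90, "drama", "fox", "P2"),
        ("Gamma", 2001, 150, "sf", "mgm", "P3"),
    ],
)

cur = conn.execute(
    "select movie.title as name, movie.length/60.0 as hours "
    "from movie "
    "where movie.studio='fox' and movie.year>1990"
)
columns = [d[0] for d in cur.description]  # attribute names after renaming
rows = cur.fetchall()

print(columns, rows)
```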
Sor?ng results
In contrast to relational algebra, SQL's select statement can sort the tuples in the resulting
relation. This is achieved by adding an order by clause at the end of the select.
Example: movie(title, year, length, genre, studio, producer)
This will sort the resulting tuples first by year in descending order, then by title in ascending
order. Note: we can use all the attributes of movie (e.g. year), not just those in the
projection. So the order of evaluation is from, where, order, select (FWOS).
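One query with the sorting behaviour described here might look like the following sketch. It assumes SQLite via Python's sqlite3 module, a cut-down movie table, and invented rows; it is an illustration, not the query from the original slide. Note that year is used for sorting although only title is projected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int)")  # simplified schema
conn.executemany(
    "insert into movie values (?, ?)",
    [("Zulu", 1999), ("Able", 1999), ("Mid", 2005), ("Old", 1980)],  # hypothetical rows
)

# Sort by year (newest first), then by title alphabetically within a year
titles = [r[0] for r in conn.execute(
    "select title from movie order by year desc, title asc"
)]

print(titles)
```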
select *
from movie, casting
select *
from movie natural join casting
Theta Join
Theta join is performed with join and an on condition or a using attribute list.
Example:
This is equivalent to
π_{title, year, name}(σ_{movie.producer=casting.name}(movie × casting))
using can be used if we want to join on specific attributes, e.g.
select title
from movie join casting using (title, year)
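Both forms of theta join can be compared side by side. The sketch below assumes SQLite via Python's sqlite3 module, a movie table reduced to the attributes needed, and invented rows. The first query joins on a condition (producers who also appear in casting), the second on the shared attributes title and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int, producer text)")  # simplified schema
conn.execute("create table casting(title text, year int, name text)")
conn.executemany(
    "insert into movie values (?, ?, ?)",
    [("Alpha", 1995, "Kim"), ("Beta", 2000, "Lee")],  # hypothetical rows
)
conn.executemany(
    "insert into casting values (?, ?, ?)",
    [("Alpha", 1995, "Kim"), ("Alpha", 1995, "Ray"), ("Beta", 2000, "Sue")],
)

# join ... on: keep pairs of tuples that satisfy the condition
on_rows = conn.execute(
    "select movie.title, casting.name "
    "from movie join casting on movie.producer = casting.name"
).fetchall()

# join ... using: equate the listed attributes of both relations
using_rows = conn.execute(
    "select title, name "
    "from movie join casting using (title, year) "
    "order by title, name"
).fetchall()

print(on_rows)
print(using_rows)
```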
Renaming Rela?ons
To form a query over two tuples from the same relation (self-join), we list the relation twice
(i.e. perform a cartesian product on itself) and rename one or both of the listed relations
using the as keyword. Renamed relations are known as correlation names.
Example:
We can also use correlation names to give us a shorter name to use in other parts of the
query:
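A self-join with correlation names could be sketched as follows, assuming SQLite via Python's sqlite3 module, a simplified movie table, and invented rows. The query pairs movies that share a title but were made in different years; m1 and m2 are the correlation names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int)")  # simplified schema
conn.executemany(
    "insert into movie values (?, ?)",
    [("King Kong", 1933), ("King Kong", 2005), ("Alien", 1979)],  # hypothetical rows
)

# movie is listed twice; m1.year < m2.year avoids pairing a tuple with itself
# and avoids returning each pair twice
remakes = conn.execute(
    "select m1.title, m1.year, m2.year "
    "from movie as m1, movie as m2 "
    "where m1.title = m2.title and m1.year < m2.year"
).fetchall()

print(remakes)
```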
Mul?-rela?on Joins
We can join as many relations as we like. The evaluation is carried out left to right unless
we use parentheses. Note: in practice a query optimiser rewrites all queries for performance
while maintaining the semantics of the query.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)
studio(name, address, boss)
More Joins
leftrelation JOIN-OPERATOR rightrelation
inner join returns tuples when there is at least one match in both relations. This corresponds
to the join we've seen in relational algebra.
left outer join is like inner join but includes all tuples from the left relation, even if there are
no matches in the right relation. Nulls are used for missing values.
right outer join is like inner join but includes all tuples from the right relation, even if there
are no matches in the left relation. Nulls are used for missing values.
full outer join is like inner join but includes all unmatched tuples from both relations. Nulls
are used for missing values.
The joins above can be natural joins (joined by matching attributes) or theta joins (joined by
a condition).
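The difference between an inner join and a left outer join shows up as soon as one relation has an unmatched tuple. The sketch below assumes SQLite via Python's sqlite3 module and invented rows. Only left outer join is shown because right and full outer joins are available only in newer SQLite releases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int)")  # simplified schema
conn.execute("create table casting(title text, year int, name text)")
conn.executemany("insert into movie values (?, ?)", [("Alpha", 1995), ("Beta", 2000)])
conn.executemany("insert into casting values (?, ?, ?)", [("Alpha", 1995, "Kim")])

condition = "m.title = c.title and m.year = c.year"

# inner join: Beta has no casting tuple, so it disappears
inner = conn.execute(
    f"select m.title, c.name from movie as m inner join casting as c on {condition} "
    "order by m.title"
).fetchall()

# left outer join: Beta is kept and its missing name is filled with null
left = conn.execute(
    f"select m.title, c.name from movie as m left outer join casting as c on {condition} "
    "order by m.title"
).fetchall()

print(inner)
print(left)
```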
Natural Outerjoins
Although natural and theta joins are often what's required, there are occasions when we'd
like to retain tuples that don't match; outerjoins give us this capability.
Examples:
Theta Outerjoins
We can also perform theta outerjoins using left outer join, right outer join, or full outer join
along with an on condition.
Example:
Elimina?ng Duplicates
Unlike relational algebra, SQL queries can potentially produce duplicate tuples. We can
eliminate them by adding the keyword distinct after the keyword select. Recall that union,
intersect and except eliminate duplicates unless all is used.
Example:
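A small sketch of duplicate handling, assuming SQLite via Python's sqlite3 module and invented casting rows. It compares a plain select with select distinct, and union with union all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table casting(title text, year int, name text)")
conn.executemany(
    "insert into casting values (?, ?, ?)",
    [("Alpha", 1995, "Kim"), ("Beta", 2000, "Kim"), ("Beta", 2000, "Sue")],  # hypothetical rows
)

# Without distinct, Kim appears once per casting tuple
all_names = [r[0] for r in conn.execute("select name from casting order by name")]
unique_names = [r[0] for r in conn.execute("select distinct name from casting order by name")]

# union removes duplicates, union all keeps them
union_count = conn.execute(
    "select count(*) from (select name from casting union select name from casting)"
).fetchone()[0]
union_all_count = conn.execute(
    "select count(*) from (select name from casting union all select name from casting)"
).fetchone()[0]

print(all_names, unique_names, union_count, union_all_count)
```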
Aggregate func?ons
The aggregate functions sum, avg, min, max and count can be used in a projection list to
calculate a single value, either from the whole resulting relation or from a part of it (see
grouping later). The parameter is typically an attribute but can be an expression. Fortunately,
nulls are excluded from these calculations.
count(distinct attribute) counts the number of distinct values of the attribute. count(*) is an
aggregate function that counts the number of tuples in a relation or group (including nulls).
Example:
Note: this query will produce a relation with a single tuple with 5 attributes.
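How nulls interact with the aggregate functions can be checked with a sketch like the one below. It assumes SQLite via Python's sqlite3 module and a hypothetical employee table with invented salaries, one of them null. It is an illustration, not the query from the original slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee(name text, salary int)")  # simplified schema
conn.executemany(
    "insert into employee values (?, ?)",
    [("A", 100), ("B", None), ("C", 300), ("D", 100)],  # B's salary is null
)

# count(*) counts tuples; the other aggregates skip the null salary
row = conn.execute(
    "select count(*), count(salary), count(distinct salary), "
    "       sum(salary), avg(salary) "
    "from employee"
).fetchone()

print(row)
```

The result is a single tuple: avg divides the sum by the three non-null salaries, not by the four tuples.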
Grouping
The select statement can group the tuples in a resulting relation. This is achieved by providing
a list of grouping attributes in a group by clause. If aggregate functions are used in the
projection list they are applied to each group.
Example:
select department,
count(*) as professors,
sum(salary) as totalsalary,
avg(salary) as averagesalary,
min(age) as youngest,
max(age) as oldest
from employee
where position = 'Professor'
group by department
order by totalsalary desc
This query will produce a relation with one tuple for each department. The results
are sorted in descending order by totalsalary.
Example result of the grouping query above:
department   professors   totalsalary   averagesalary   youngest   oldest
physics      30           3000000       100000          35         68
computing    25           2000000       80000           40         65
...
materials                 360000        90000           50         62
The following query will produce a relation with one tuple for each department that has at
least 10 professors. The results are sorted in descending order by totalsalary.
Example
select department,
count(*) as professors,
sum(salary) as totalsalary,
avg(salary) as averagesalary,
min(age) as youngest,
max(age) as oldest
from employee
where position = 'Professor'
group by department
having count(*)>=10
order by totalsalary desc
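The effect of having on grouped results can be seen by running the same grouping query with and without it. This sketch assumes SQLite via Python's sqlite3 module, a reduced employee table, and invented rows. The threshold is lowered to 2 so that the tiny sample shows a difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employee(name text, department text, position text, salary int)"
)
conn.executemany(
    "insert into employee values (?, ?, ?, ?)",
    [
        ("A", "physics", "Professor", 100),    # hypothetical rows
        ("B", "physics", "Professor", 120),
        ("C", "computing", "Professor", 90),
        ("D", "computing", "Lecturer", 50),    # removed by the where clause
    ],
)

base = (
    "select department, count(*) as professors, sum(salary) as totalsalary "
    "from employee where position = 'Professor' group by department "
)

# One tuple per department that has at least one professor
grouped = conn.execute(base + "order by totalsalary desc").fetchall()

# having filters whole groups after aggregation
filtered = conn.execute(
    base + "having count(*) >= 2 order by totalsalary desc"
).fetchall()

print(grouped)
print(filtered)
```

where removes individual tuples before grouping, while having removes groups after the aggregates have been computed.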
Subqueries
One of the most powerful features of selects is that they can be used as subqueries in
expressions by enclosing them in parentheses, i.e. (subquery). SQL supports scalar, set and
relation subqueries:
Scalar subquery
Set subquery
Relation subquery
Scalar subquery
A select that produces a single value. Scalar subqueries can be used in any expression, e.g. in
projection lists, in where and having clauses. They are often selects with a single aggregate
function.
Example:
select title,
(select count(name)
from casting
where casting.title=movie.title) as numactors
from movie;
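The scalar subquery example can be run on a couple of rows. This is a sketch assuming SQLite via Python's sqlite3 module, simplified movie and casting tables, and invented rows; an order by is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int)")  # simplified schema
conn.execute("create table casting(title text, year int, name text)")
conn.executemany("insert into movie values (?, ?)", [("Alpha", 1995), ("Beta", 2000)])
conn.executemany(
    "insert into casting values (?, ?, ?)",
    [("Alpha", 1995, "Kim"), ("Alpha", 1995, "Ray")],  # Beta has no casting tuples
)

# The subquery is evaluated once per movie tuple and yields a single value
rows = conn.execute(
    "select title, "
    "       (select count(name) from casting "
    "        where casting.title = movie.title) as numactors "
    "from movie order by title"
).fetchall()

print(rows)
```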
select title
from movie
where studio in (select name from studio
where address like 'C%')
We can extend the approach to tuple values enclosed in parentheses:
select name
from casting
where (title, year) not in
(select title, year from movie where genre='sf')
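Both set subquery forms can be exercised with a sketch like the following. It assumes SQLite via Python's sqlite3 module (a version recent enough to support tuple values in comparisons), simplified tables, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int, genre text, studio text)")  # simplified
conn.execute("create table studio(name text, address text)")
conn.execute("create table casting(title text, year int, name text)")
conn.executemany(
    "insert into movie values (?, ?, ?, ?)",
    [("Alpha", 1995, "sf", "fox"), ("Beta", 2000, "drama", "mgm")],  # hypothetical rows
)
conn.executemany(
    "insert into studio values (?, ?)",
    [("fox", "Century City"), ("mgm", "Santa Monica")],
)
conn.executemany(
    "insert into casting values (?, ?, ?)",
    [("Alpha", 1995, "Kim"), ("Beta", 2000, "Sue")],
)

# in: movies whose studio appears in the set returned by the subquery
in_c_city = [r[0] for r in conn.execute(
    "select title from movie "
    "where studio in (select name from studio where address like 'C%')"
)]

# not in with tuple values: cast members of movies that are not sf
not_sf_cast = [r[0] for r in conn.execute(
    "select name from casting "
    "where (title, year) not in "
    "      (select title, year from movie where genre = 'sf')"
)]

print(in_c_city, not_sf_cast)
```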
Alternative:
select title
from movie join studio on studio.name=movie.studio
where studio.address like 'C%'
Example 2:
select name
from casting
where (title, year) not in
(select title, year from movie where genre='sf')
Alternative:
select name
from casting join movie using (title, year)
where genre<>'sf'
select title
from movie m1
where year < any(select year from movie m2
where m2.title=m1.title)
All requires:
select name
from employee
where salary <> all(select salary from employee
where position='Professor')
Rela?on subqueries
The exists and not exists functions with a subquery argument can be used to test whether a
relation is empty (has no tuples) or not.
The unique function can be used to test whether a relation has no duplicates, and not unique
to test whether it has duplicates.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)
studio(name, address, boss)
select title
from movie m1
where not exists(select * from movie m2
where m2.title=m1.title and m2.year<>m1.year
)
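The not exists example can be run on a small sample. This is a sketch assuming SQLite via Python's sqlite3 module, a simplified movie table, and invented rows. The query keeps only titles for which no movie with the same title exists in a different year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table movie(title text, year int)")  # simplified schema
conn.executemany(
    "insert into movie values (?, ?)",
    [("King Kong", 1933), ("King Kong", 2005), ("Alien", 1979)],  # hypothetical rows
)

# For each m1, the subquery looks for another movie with the same title
# made in a different year; not exists is true when that relation is empty
single_version = [r[0] for r in conn.execute(
    "select title from movie m1 "
    "where not exists (select * from movie m2 "
    "                  where m2.title = m1.title and m2.year <> m1.year)"
)]

print(single_version)
```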
Common AIributes
Main Rela?ons
student
xstudentstaff
xcoursestaff
course
class
degree
book
xcourseclass
xcoursebook
Each relation has several temporal views (named relational expressions), e.g.:
coursecurr - courses for current year
course0910 - courses for academic year 2009-2010, similar for 0809 etc.
Examples 1
Q. List all staff who do not have a College or Department email address, sort results by
lastname.
Q. List all staff with the same lastname, show names of staff and their namesake(s)
Solu?ons 1
Q. List all staff who do not have a College or Department email address, sort by
lastname.
select id, lastname, email
from staffcurr
where not(email like '%imperial.ac.uk' or email like '%doc.ic.ac.uk')
order by lastname
Q. List all staff with the same lastname, show names of staff and their namesake(s)
Examples 2
Q. List all books recommended for courses taught by Prof Kelly, similar to:
Solu?on 2
Q. List all books recommended for courses taught by Prof Kelly, similar to:
select c.code, c.title, xb.rating, b.title, b.authors, b.publisher,
b.code
from staffcurr s join xcoursestaffcurr xc on s.id=xc.staffid
join coursecurr c on xc.courseid=c.id
join xcoursebookcurr xb on c.id=xb.courseid
join bookcurr b on xb.bookid=b.id
where s.lastname='Kelly'
order by c.code, xb.rating
Examples 3
Q. List courses being taken by student with login rf611, similar to:
Solu?on 3a
Q. List courses being taken by student with login rf611, similar to:
select c.code, c.title, xc.examcode, c.term, c.popestimate, c.classes
from studentcurr s join xstudentclasscurr xa on s.id=xa.studentid
join classcurr ca on xa.classid=ca.id
join xcourseclasscurr xc on ca.id=xc.classid
join coursecurr c on xc.courseid=c.id
where s.login='rf611'
order by c.code
Solu?on 3b
Q. List courses being taken by student with login rf611, similar to:
select c.code, c.title, xc.examcode, c.term, c.popestimate, c.classes
from studentcurr s join xstudentclasscurr xa on s.id=xa.studentid
join xcourseclasscurr xc on xa.classid=xc.classid
join coursecurr c on xc.courseid=c.id
where s.login='rf611'
order by c.code
Examples 4
Q. List all PPT tutors and their PPT tutees (role=PPT) order staff and students:
Tutor lastname, tutor firstname, Tutee lastname, firstname suitably sorted.
Q. List all PPT tutors and how many PPT tutees they have, suitably sorted.
Solu?ons 4
Q. List all PPT tutors and their PPT tutees (role=PPT) order staff and students:
Tutor lastname, tutor firstname, Tutee lastname, firstname suitably sorted.
Q. List all PPT tutors and how many PPT tutees they have, suitably sorted.
Examples 5
Q. List all tutoring roles
Q. List all tutoring roles and how many tutors there are for each role
Solu?ons 5
Q. List all tutoring roles
select role
from xstudentstaffcurr
group by role
order by role
Q. List all tutoring roles and how many tutors there are for each role