Structured Query Language: Thomas Heinis, T.heinis@imperial.ac.uk, wp.doc.ic.ac.uk/theinis

This document provides an introduction to Structured Query Language (SQL). It states that SQL is used in over 99% of database applications and supports functions like schema creation/modification, data manipulation, constraints, indexing, and more. It also notes that most SQL implementations provide procedural languages for writing stored procedures that can execute SQL statements and be triggered by events. The document then gives examples of some basic SQL statements and concepts like joins, aggregation, and sorting query results.

Structured Query Language
Thomas Heinis

T.heinis@imperial.ac.uk
wp.doc.ic.ac.uk/theinis

SQL
SQL (Structured Query Language) is the most prominent relational
database language, used in more than 99% of database applications.
SQL supports schema creation and modification; data insertion, retrieval,
update and deletion; constraints, indexing of attributes, transactions, data
access control (authorisation), plus lots more.
Most SQL implementations also provide one or more procedural languages
for writing procedures (stored procedures) that execute SQL statements
within an RDBMS. Such procedures can be called directly by the user, by
client programs, by other stored procedures, or automatically by the
database when certain events happen (triggers), for example after a tuple
is updated.
We'll do a tour of SQL (well, a subset of SQL).

Relational Algebra to SQL

Relational Algebra      SQL
R ∪ S                   R union S
R ∩ S                   R intersect S
R − S                   R except S
π_attributes(R)         select attributes from R
σ_condition(R)          from R where condition
R × S                   R, S or R cross join S
R ⋈ S                   R natural join S
R ⋈_condition S         R join S on condition

Vocabulary
Relational Algebra       SQL      Comment
Relation                 Table    Tables are persistent relations stored on disk.
Relational Expression    Views    Views are relations based on other relations. Views are not normally stored nor updateable, unless they're materialised.
Tuple                    Row      Sometimes called a record
Attribute                Column   Sometimes called a field
Domain                   Type     Types include char, int, float, date, time

SQL Gotchas
Standards
Which SQL? Every SQL vendor supports a different subset of one of
the SQL standards plus their own extensions. Moving a database from
one vendor to another is non-trivial.

Duplicates
SQL is based on multi-sets (bags), not sets. Relations in SQL can have
duplicate tuples. Duplicates are best avoided.

Nulls
Attributes do not need to have a value; they can be null. Nulls can be
used to indicate a missing value, a value that is not known, a value
that is private, etc. Nulls are best avoided.

Booleans
Booleans are based on three-valued logic (3VL). They can be true,
false or unknown!

Types
Most SQL implementations support a wide range of types including:

int, smallint, real, double precision, float(n), numeric(p,d), decimal(p,d)
Most DBs support a variety of integer and floating point types.
Ranges are implementation dependent. The usual arithmetic
operators are available.

char, char(n), varchar(n), clob/text, ...
Strings can be fixed length (padded with spaces), varying length (up to
n), or unlimited length (clob/text). The concatenation operator is ||.
string like pattern performs pattern matching where pattern can
include _ for any character and % for zero or more chars, e.g. X like
'B%' matches any string starting with B. There is also similar to for
regular expression matches.

bit(n), byte(n), blob
Bits, bytes, and binary large objects (blobs). Often used for audio,
images, movies, files, etc.

Types
A few more types:

boolean
Booleans are based on three-valued logic (3VL). They can be
true, false or unknown! See later for truth tables. Comparison
operators include between, not between, in, not in. Examples:
age between 45 and 49 for age>=45 and age<=49
name not in ('Fred', 'Jim', 'Alice')

date, time, timestamp
Dates and times are specified like:
date '1994-02-25', time '12:45:02',
timestamp '1994-02-25 12:45:02'. SQL supports date and time
expressions as well as timezones and intervals.

...
Each RDBMS has a long list of additional types for currency, xml,
geo-spatial data, CAD data, multi-media, etc. Some also support
user-defined types.


Truth Table for 3-valued Logic


x         y         x and y   x or y    not x
TRUE      TRUE      TRUE      TRUE      FALSE
TRUE      unknown   unknown   TRUE      FALSE
TRUE      FALSE     FALSE     TRUE      FALSE
unknown   TRUE      unknown   TRUE      unknown
unknown   unknown   unknown   unknown   unknown
unknown   FALSE     FALSE     unknown   unknown
FALSE     TRUE      FALSE     TRUE      TRUE
FALSE     unknown   FALSE     unknown   TRUE
FALSE     FALSE     FALSE     FALSE     TRUE

You can complete the truth table with the following mapping:
1 - TRUE
½ - unknown
0 - FALSE
x and y = min(x, y)
x or y = max(x, y)
not x = 1 − x
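The numeric mapping for three-valued logic can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the names and3, or3 and not3 are hypothetical helpers chosen for illustration, and unknown is encoded as the fraction 1/2.

```python
from fractions import Fraction

# Encode the three truth values as numbers: TRUE = 1, unknown = 1/2, FALSE = 0.
TRUE, UNKNOWN, FALSE = Fraction(1), Fraction(1, 2), Fraction(0)
NAMES = {TRUE: "TRUE", UNKNOWN: "unknown", FALSE: "FALSE"}

def and3(x, y):
    """x and y is the minimum of the two encoded values."""
    return min(x, y)

def or3(x, y):
    """x or y is the maximum of the two encoded values."""
    return max(x, y)

def not3(x):
    """not x is 1 minus the encoded value."""
    return 1 - x

# Print all nine rows: x, y, x and y, x or y, not x.
for x in (TRUE, UNKNOWN, FALSE):
    for y in (TRUE, UNKNOWN, FALSE):
        print(NAMES[x], NAMES[y], NAMES[and3(x, y)], NAMES[or3(x, y)], NAMES[not3(x)])
```

Running the loop prints one line per combination of x and y, which can be compared row by row with the truth table.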

Nulls
SQL attributes can have the special value null. There are many interpretations for null,
including:

Missing
There is some value, but we don't know what it is at the
moment, e.g. missing birthdate.

Not applicable
No value makes sense, e.g. spouse's name for an unmarried
person.

Withheld
There is a value, but we're not entitled to record the value, e.g.
an unlisted phone number.

Nulls
We need to understand the implications of nulls on arithmetic and
comparisons, including for joins (see later).

Arithmetic
Any arithmetic that involves a null will result in a null.
Note: In SQL, 0*y where y is null is null!
y-y is also null if y is null!

Comparisons
Any comparison involving a null will result in unknown, e.g. x>y where y is
null will result in unknown.
null is not a constant value like true and can't be used in comparisons.
To test if an attribute y is null, use y is null or y is not null.
null will never match any other value (even null itself), unless we explicitly
use is null or is not null.
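These rules for nulls can be tried out directly. The sketch below is an illustration rather than part of the slides: it assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in RDBMS, and the table t is made up. Python shows SQL null (and an unknown comparison result) as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic and comparisons that involve null all yield null/unknown;
# only "is null" gives a definite answer (SQLite reports true as 1).
row = conn.execute(
    "select 0 * null, null - null, 5 > null, null = null, null is null"
).fetchone()
print(row)

# A made-up table with one known value and one null.
conn.execute("create table t (y integer)")
conn.executemany("insert into t values (?)", [(1,), (None,)])

# "y = null" is unknown for every row, so no row passes the where clause.
eq_null = conn.execute("select count(*) from t where y = null").fetchone()[0]
# "y is null" is the correct test and finds the one null row.
is_null = conn.execute("select count(*) from t where y is null").fetchone()[0]
print(eq_null, is_null)
```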

Queries
Probably the most used and most complex statement in
SQL is the select statement, which is used to query
(retrieve) data from a database.
select supports all the relational operators as well as
sorting, grouping and aggregate functions. The relation
produced by a select is normally returned to the user or
client program, but can be used as a subquery in
expressions.
Example:
movie(title, year, length, genre, studio, producer)

select title, length
from movie
where studio='fox' and year>1990

To return all attributes, use * for the projected attributes.
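To see a select of this shape run end to end, here is a small sketch that assumes Python's built-in sqlite3 module and an in-memory database. The movie rows are invented for illustration; only the schema and the query come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table movie (title text, year integer, length integer, "
    "genre text, studio text, producer text)"
)
# Invented sample data: one old fox movie, one recent fox movie, one mgm movie.
conn.executemany("insert into movie values (?, ?, ?, ?, ?, ?)", [
    ("Alpha", 1985, 100, "drama", "fox", "Pat"),
    ("Beta", 1995, 120, "sf", "fox", "Sam"),
    ("Gamma", 1999, 90, "sf", "mgm", "Lee"),
])

# Selection (where) keeps recent fox movies; projection keeps title and length.
rows = conn.execute(
    "select title, length from movie where studio = 'fox' and year > 1990"
).fetchall()
print(rows)

# select * returns every attribute of the qualifying tuples.
all_attributes = conn.execute(
    "select * from movie where studio = 'fox' and year > 1990"
).fetchall()
print(all_attributes)
```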

Renaming Attributes
We can rename attributes (and use expressions) in the projection part
of a select with the as keyword. Renaming is useful if we have clashing
attribute names that represent different things, or we want to carry
out set operations on relations with differing attribute names. as can
also be used to rename relations, as we'll see.
Example:
movie(title, year, length, genre, studio, producer)

select title as name, length/60 as hours
from movie
where studio='fox' and year>1990

For readability or to disambiguate attributes we can prefix the relation name,
e.g.:

select movie.title as name, movie.length/60 as hours
from movie
where movie.studio='fox' and movie.year>1990

Sorting results
In contrast to relational algebra, SQL's select statement can sort the tuples in the resulting
relation. This is achieved by adding an order by clause at the end of the select.
Example: movie(title, year, length, genre, studio, producer)

select title, length
from movie
where studio='fox' and year>1990
order by year desc, title asc

This will sort the resulting tuples first by year in descending order, then by title in ascending
order. Note: we can use all the attributes of movie (e.g. year), not just those in the
projection. So the order of evaluation is from, where, order, select (FWOS).

Cartesian Product and Natural Join


The from clause is used to define a cartesian product or perform various joins.

Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)

select *
from movie, casting

This is equivalent to movie × casting in relational algebra,
while

select *
from movie natural join casting

is equivalent to movie ⋈ casting in relational algebra.
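The difference between the product and the natural join shows up in both the number of tuples and the number of attributes. The sketch below assumes Python's built-in sqlite3 module with an in-memory database; the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table movie (title text, year integer, length integer, "
    "genre text, studio text, producer text)"
)
conn.execute("create table casting (title text, year integer, name text)")
# Invented data: two movies and three casting tuples, one of which
# matches no movie on (title, year).
conn.executemany("insert into movie values (?, ?, ?, ?, ?, ?)", [
    ("Alpha", 1995, 120, "sf", "fox", "Pat"),
    ("Beta", 1999, 95, "drama", "mgm", "Sam"),
])
conn.executemany("insert into casting values (?, ?, ?)", [
    ("Alpha", 1995, "Ann"),
    ("Alpha", 1995, "Bob"),
    ("Beta", 2005, "Cy"),
])

# Cartesian product: every movie tuple paired with every casting tuple.
cross = conn.execute("select * from movie, casting")
cross_columns = len(cross.description)
cross_rows = cross.fetchall()

# Natural join: pairs that agree on the shared attributes title and year,
# with the shared attributes appearing only once in the result.
natural = conn.execute("select * from movie natural join casting")
natural_columns = len(natural.description)
natural_rows = natural.fetchall()

print(len(cross_rows), cross_columns)
print(len(natural_rows), natural_columns)
```

With this data the product has 2 × 3 tuples over 6 + 3 attributes, while the natural join keeps only the two Alpha pairings over 7 attributes.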

Theta Join
Theta join is performed with join and an on condition or a using attribute list.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)

select movie.title, movie.year, name
from movie join casting on movie.producer=casting.name

This is equivalent to
π_movie.title, movie.year, name(σ_movie.producer=casting.name(movie × casting))
title and year are prefixed with the relation name because both relations have
attributes with those names.
using can be used if we want to join on specific attributes, e.g.

select title
from movie join casting using (title, year)

which is the same as on movie.title=casting.title and movie.year=casting.year

Renaming Relations
To form a query over two tuples from the same relation (self-join), we list the relation twice
(i.e. perform a cartesian product on itself) and rename one or both of the listed relations
using the as keyword. Renamed relations are known as correlation names.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name, address)

select casting1.name, casting2.name
from casting as casting1 join casting as casting2
     on casting1.address = casting2.address and
        casting1.name < casting2.name

We can also use correlation names to give us a shorter name to use in other parts of the
query:

select m.title, m.studio, c.name
from movie m join casting c on m.producer=c.name

Multi-relation Joins
We can join as many relations as we like. The evaluation is carried out left to right unless
we use parentheses. Note: in practice a query optimiser rewrites all queries for performance
while maintaining the semantics of the query.

Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)
studio(name, address, boss)

select casting.name, movie.producer, studio.boss
from casting join movie using (title)
     join studio on movie.studio=studio.name
where movie.year >= 1990

Union, Intersection, Difference

We can combine relations using the set operators union (∪), intersect (∩) and except (−). We
typically use these operators on the relations generated by selects, which should be
parenthesised.
Example:
actor(name, address, gender, birthdate)
producer(name, address, networth)

(select name, address
 from actor
 where gender='F')
intersect
(select name, address
 from producer
 where networth>=100000000)
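A runnable version of an intersect query of this kind is sketched below, assuming Python's built-in sqlite3 module and invented rows. One caveat of that choice: SQLite does not accept parentheses around the operands of a set operator, so they are left out here even though other systems allow or expect them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table actor (name text, address text, gender text, birthdate text)")
conn.execute("create table producer (name text, address text, networth integer)")
# Invented data: Ann and Cat are both actors and producers,
# but only Ann has a net worth of at least 100 million.
conn.executemany("insert into actor values (?, ?, ?, ?)", [
    ("Ann", "1 Elm St", "F", "1970-01-01"),
    ("Bob", "2 Oak St", "M", "1965-05-05"),
    ("Cat", "3 Ash St", "F", "1980-03-03"),
])
conn.executemany("insert into producer values (?, ?, ?)", [
    ("Ann", "1 Elm St", 250000000),
    ("Cat", "3 Ash St", 5000000),
    ("Dan", "4 Fir St", 900000000),
])

# Female actors who are also producers worth at least 100 million.
rich_actresses = conn.execute("""
    select name, address from actor where gender = 'F'
    intersect
    select name, address from producer where networth >= 100000000
""").fetchall()
print(rich_actresses)

# except keeps the female actors that the second select does not return.
others = conn.execute("""
    select name, address from actor where gender = 'F'
    except
    select name, address from producer where networth >= 100000000
""").fetchall()
print(others)
```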

More Joins
leftrelation JOIN-OPERATOR rightrelation
inner join returns tuples when there is at least one match in both relations. This corresponds
to the join we've seen in relational algebra.
left outer join is like inner join but includes all tuples from the left relation, even if there are
no matches in the right relation. Nulls are used for missing values.
right outer join is like inner join but includes all tuples from the right relation, even if there
are no matches in the left relation. Nulls are used for missing values.
full outer join is like inner join but includes all unmatched tuples from both relations. Nulls
are used for missing values.
The joins above can be natural joins (joined by matching attributes) or theta joins (joined by
a condition).

Natural Outerjoins
Although natural and theta joins are often what's required, there are occasions when we'd
like to retain tuples that don't match; outerjoins give us this capability.
Examples:

[Figure: example relations L and R with the results of L natural left join R,
L natural right join R and L natural full join R; the attributes of unmatched
tuples are filled with null.]
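As a small stand-in illustration of a natural outerjoin, the sketch below uses Python's built-in sqlite3 module with two invented relations l(a, b) and r(b, c); these are not the relations from the slide's figure. Only the left variant is shown because support for right and full outer joins depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table l (a integer, b integer)")
conn.execute("create table r (b integer, c integer)")
# Invented data: only b = 2 appears in both relations.
conn.executemany("insert into l values (?, ?)", [(1, 2), (3, 4)])
conn.executemany("insert into r values (?, ?)", [(2, 5), (6, 7)])

# Natural (inner) join: only the tuple of l that has a partner in r survives.
inner_rows = conn.execute(
    "select * from l natural join r order by a"
).fetchall()

# Natural left outer join: every tuple of l is kept; the unmatched one
# gets null (None in Python) for the attribute c that comes from r.
left_rows = conn.execute(
    "select * from l natural left join r order by a"
).fetchall()

print(inner_rows)
print(left_rows)
```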

Theta Outerjoins
We can also perform theta outerjoins using left outer join, right outer join, or full outer join
along with an on condition.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)

select movie.title, movie.year, name
from movie left outer join casting on
     movie.producer=casting.name and movie.year=casting.year

title and year are prefixed with the relation name because both relations have
attributes with those names.

Eliminating Duplicates
Unlike relational algebra, SQL queries can potentially produce duplicate tuples. We can
eliminate them by adding the keyword distinct after the keyword select. Recall that union,
intersect and except eliminate duplicates unless all is used.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)

select distinct movie.title, movie.year, name
from movie left outer join casting on
     movie.producer=casting.name and movie.year=casting.year

Aggregate functions
The aggregate functions sum, avg, min, max and count can be used in a projection list to
calculate a single value, either from the whole resulting relation or from a part of it (see
grouping later). The parameter is typically an attribute but can be an expression. Fortunately,
nulls are excluded from these calculations.
count(distinct attribute) counts the number of distinct values of the attribute. count(*) is an
aggregate function that counts the number of tuples in a relation or group (including nulls).
Example:

select count(*) as professors,
       sum(salary) as totalsalary,
       avg(salary) as averagesalary,
       min(age) as youngest,
       max(age) as oldest
from employee
where position = 'Professor'

Note: this query will produce a relation with a single tuple with 5 attributes.
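How nulls interact with the aggregate functions is easy to check on a tiny table. This is a sketch assuming Python's built-in sqlite3 module; the employee rows, and the cut-down employee schema, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee (name text, position text, salary integer, age integer)")
# Invented data: one professor has no recorded salary, another no recorded age.
conn.executemany("insert into employee values (?, ?, ?, ?)", [
    ("A", "Professor", 100, 50),
    ("B", "Professor", None, 60),
    ("C", "Professor", 200, None),
    ("D", "Lecturer", 50, 30),
])

# count(*) counts tuples; count(salary), sum, avg, min and max skip nulls.
result = conn.execute("""
    select count(*)      as professors,
           count(salary) as salaries,
           sum(salary)   as totalsalary,
           avg(salary)   as averagesalary,
           min(age)      as youngest,
           max(age)      as oldest
    from employee
    where position = 'Professor'
""").fetchone()
print(result)
```

Note that the average is taken over the two known salaries, not over the three professors.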

Grouping
The select statement can group the tuples in a resulting relation. This is achieved by providing
a list of grouping attributes in a group by clause. If aggregate functions are used in the
projection list they are applied to each group.
Example:

select department,
       count(*) as professors,
       sum(salary) as totalsalary,
       avg(salary) as averagesalary,
       min(age) as youngest,
       max(age) as oldest
from employee
where position = 'Professor'
group by department
order by totalsalary desc

This query will produce a relation with one tuple for each department. The results
are sorted in descending order by totalsalary.

Example
select department,
       count(*) as professors,
       sum(salary) as totalsalary,
       avg(salary) as averagesalary,
       min(age) as youngest,
       max(age) as oldest
from employee
where position = 'Professor'
group by department
order by totalsalary desc

department   professors   totalsalary   averagesalary   youngest   oldest
physics      30           3000000       100000          35         68
computing    25           2000000       80000           40         65
...
materials                 360000        90000           50         62

Filtering groups by aggregate functions

We can filter groups using a predicate with aggregate functions that is applied to each group
by adding a having clause after the group by clause.
Example:

select department,
       count(*) as professors,
       sum(salary) as totalsalary,
       avg(salary) as averagesalary,
       min(age) as youngest,
       max(age) as oldest
from employee
where position = 'Professor'
group by department
having count(*)>=10
order by totalsalary desc

This query will produce a relation with one tuple for each department that has at
least 10 professors. The results are sorted in descending order by totalsalary.
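The interplay of where, group by, having and order by can be seen on a handful of rows. The sketch below assumes Python's built-in sqlite3 module; the employee data is invented, and the threshold is lowered to 2 professors so that a small table is enough to show a group being filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee (name text, department text, position text, salary integer)")
# Invented data: physics has 3 professors, computing 2, materials 1.
conn.executemany("insert into employee values (?, ?, ?, ?)", [
    ("A", "physics", "Professor", 100),
    ("B", "physics", "Professor", 110),
    ("C", "physics", "Professor", 90),
    ("D", "physics", "Lecturer", 50),
    ("E", "computing", "Professor", 80),
    ("F", "computing", "Professor", 80),
    ("G", "materials", "Professor", 90),
])

# where filters tuples before grouping; having filters whole groups afterwards.
groups = conn.execute("""
    select department,
           count(*)    as professors,
           sum(salary) as totalsalary
    from employee
    where position = 'Professor'
    group by department
    having count(*) >= 2
    order by totalsalary desc
""").fetchall()
print(groups)
```

The lecturer is removed by where before the groups are formed, and the materials group is removed by having because it contains only one professor.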

Example
select department,
       count(*) as professors,
       sum(salary) as totalsalary,
       avg(salary) as averagesalary,
       min(age) as youngest,
       max(age) as oldest
from employee
where position = 'Professor'
group by department
having count(*)>=10
order by totalsalary desc

department   professors   totalsalary   averagesalary   youngest   oldest
physics      30           3000000       100000          35         68
computing    25           2000000       80000           40         65
...
materials                 360000        90000           50         62

Subqueries
One of the most powerful features of selects is that they can be used as subqueries in
expressions by enclosing them in parentheses, i.e. (subquery). SQL supports scalar, set and
relation subqueries.

Scalar subquery
A subquery that produces a single value. Typically a select with an aggregate
function.

Set subquery
A subquery that produces a set of distinct values (a single column).
Typically used for (i) set membership using operators in or not in,
or (ii) set comparisons using operators some (any) or all.

Relation subquery
A subquery that produces a relation. Typically used as an operand of (i)
products, joins, unions, intersects, excepts, (ii) operators exists or not exists
to test if a relation is empty or not, (iii) operators not unique or unique to test
if a relation has duplicates or not.

Scalar subquery
A select that produces a single value. Scalar subqueries can be used in any expression, e.g. in
projection lists, in where and having clauses. They are often selects with a single aggregate
function.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)

select title,
       (select count(name)
        from casting
        where casting.title=movie.title) as numactors
from movie;

Join instead of subquery


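Joins can often be used instead of subqueries. It is clearer to use joins when possible.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)

select title, year,
       (select count(name)
        from casting
        where casting.title=movie.title) as numactors
from movie;

Alternative with joins:

select title, movie.year,
       count(name) as numactors
from movie join casting using (title)
group by title, movie.year

Note: year is prefixed with movie because both relations have a year attribute and
only title is listed in using. The join version omits movies that have no casting
tuples, whereas the subquery version reports them with a count of 0; a left outer
join retains them.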
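The subquery and join formulations can be compared side by side. This sketch assumes Python's built-in sqlite3 module; the movie schema is cut down to two attributes and all rows are invented, with one movie deliberately given no casting tuples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, invented schema and data; Gamma has no casting tuples.
conn.execute("create table movie (title text, year integer)")
conn.execute("create table casting (title text, year integer, name text)")
conn.executemany("insert into movie values (?, ?)", [
    ("Alpha", 1995), ("Beta", 1999), ("Gamma", 2001),
])
conn.executemany("insert into casting values (?, ?, ?)", [
    ("Alpha", 1995, "Ann"), ("Alpha", 1995, "Bob"), ("Beta", 1999, "Cy"),
])

# Correlated scalar subquery: evaluated once per movie tuple.
with_subquery = conn.execute("""
    select title, year,
           (select count(name) from casting
            where casting.title = movie.title) as numactors
    from movie
    order by title
""").fetchall()

# Inner join plus grouping: movies without any casting tuple disappear.
with_join = conn.execute("""
    select title, movie.year, count(name) as numactors
    from movie join casting using (title)
    group by title, movie.year
    order by title
""").fetchall()

# Left outer join plus grouping: count(name) skips the null, giving 0.
with_left_join = conn.execute("""
    select title, movie.year, count(name) as numactors
    from movie left outer join casting using (title)
    group by title, movie.year
    order by title
""").fetchall()

print(with_subquery)
print(with_join)
print(with_left_join)
```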

Set membership subqueries


Subqueries that produce a set of values can be used to test if a value is a member of the set by
using the in or not in operators.
Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)
studio(name, address, boss)

select title
from movie
where studio in (select name from studio
                 where address like 'C%')

We can extend the approach to tuple values enclosed in parentheses:

select name
from casting
where (title, year) not in
      (select title, year from movie where genre='sf')

Join instead of a subquery


Example 1:
select title
from movie
where studio in (select name from studio
                 where address like 'C%')

Alternative:
select title
from movie join studio on studio.name=movie.studio
where studio.address like 'C%'

Example 2:
select name
from casting
where (title, year) not in
      (select title, year from movie where genre='sf')

Alternative:
select name
from casting join movie using (title, year)
where genre<>'sf'

Note: the two forms of Example 2 agree only when every casting tuple has exactly one
matching movie tuple and genre is not null.

Set comparison subqueries


We can use subqueries to compare a value against some or all values returned by a subquery,
using the some (also written any) and all operators respectively.

Example:
movie(title, year, length, genre, studio, producer)

select title
from movie m1
where year < any(select year from movie m2
                 where m2.title=m1.title)

Example using all:

select name
from employee
where salary <> all(select salary from employee
                    where position='Professor')

Relation subqueries
The exists and not exists functions with a subquery argument can be used to test whether a
relation is empty (has no tuples) or not.

The not unique function can be used to test whether a relation has duplicates, and unique
to test that it has none.

Example:
movie(title, year, length, genre, studio, producer)
casting(title, year, name)
studio(name, address, boss)

select title
from movie m1
where not exists(select * from movie m2
                 where m2.title=m1.title and m2.year<>m1.year)
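A not exists query of this form returns the titles that have been used for only one year. The sketch below assumes Python's built-in sqlite3 module; the movie schema is cut down to two attributes and the rows are sample data chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified schema; sample rows in which one title is used for two years.
conn.execute("create table movie (title text, year integer)")
conn.executemany("insert into movie values (?, ?)", [
    ("King Kong", 1933),
    ("King Kong", 2005),
    ("Alien", 1979),
])

# For each movie m1, the subquery looks for another movie m2 with the same
# title but a different year; not exists is true when that relation is empty.
unique_titles = conn.execute("""
    select title
    from movie m1
    where not exists (select * from movie m2
                      where m2.title = m1.title and m2.year <> m1.year)
""").fetchall()
print(unique_titles)

# Dropping "not" gives the complementary set: titles that were remade.
remade = conn.execute("""
    select distinct title
    from movie m1
    where exists (select * from movie m2
                  where m2.title = m1.title and m2.year <> m1.year)
""").fetchall()
print(remade)
```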

DoC Teaching Database


Common Attributes
id, opened, openedby, updated, updatedby, validfrom, validto

Main Relations
staff(login, email, lastname, firstname, telephone, room, deptrole, department)
student(login, email, lastname, status, entryyear, externaldept)
course(code, title, syllabus, term, classes, popestimate)
class(degreeid, yr, degree, degreeyr, major, majoryr, letter, letteryr)
degree(title, code, major, grp, letter, years)
book(code, title, authors, publisher)

Many-to-Many Joining Relations
xcourseclass(courseid, classid, required, examcode)
xcoursebook(courseid, bookid, rating)
xcoursestaff(courseid, staffid, staffhours, role, term)
xstudentclass(studentid, classid)
xstudentstaff(studentid, staffid, role, grp, projectfile)

Q. List all staff who do not have a College or Department email address, sort results by
lastname
Q. List all staff with the same lastname, show names of staff and their namesake(s)

DoC Teaching Database


[Figure: schema diagram linking staff, student, course, class, degree and book
through the joining relations xstudentclass, xstudentstaff, xcoursestaff,
xcourseclass and xcoursebook.]

Each relation has several temporal views (named relational expressions), e.g.:

coursecurr - courses for current year
course0910 - courses for academic year 2009-2010, similar for 0809 etc.

Examples 1
Q. List all staff who do not have a College or Department email address, sort results by
lastname.

Q. List all staff with the same lastname, show names of staff and their namesake(s)

Solutions 1
Q. List all staff who do not have a College or Department email address, sort by
lastname.

select id, lastname, email
from staffcurr
where not(email like '%imperial.ac.uk' or email like '%doc.ic.ac.uk')
order by lastname

Q. List all staff with the same lastname, show names of staff and their namesake(s)

select s1.id, s1.firstname, s1.lastname,
       s2.id, s2.firstname, s2.lastname
from staffcurr s1 join staffcurr s2 on s1.lastname=s2.lastname and
                                      s1.id < s2.id
order by s1.lastname
Examples 2
Q. List all books recommended for courses taught by Prof Kelly.

Solution 2
Q. List all books recommended for courses taught by Prof Kelly.

select c.code, c.title, xb.rating, b.title, b.authors, b.publisher,
       b.code
from staffcurr s join xcoursestaffcurr xc on s.id=xc.staffid
     join coursecurr c on xc.courseid=c.id
     join xcoursebookcurr xb on c.id=xb.courseid
     join bookcurr b on xb.bookid=b.id
where s.lastname='Kelly'
order by c.code, xb.rating
Examples 3
Q. List courses being taken by student with login rf611.

Solution 3a
Q. List courses being taken by student with login rf611.

select c.code, c.title, xc.examcode, c.term, c.popestimate, c.classes
from studentcurr s join xstudentclasscurr xa on s.id=xa.studentid
     join classcurr ca on xa.classid=ca.id
     join xcourseclasscurr xc on ca.id=xc.classid
     join coursecurr c on xc.courseid=c.id
where s.login='rf611'
order by c.code

Solution 3b
Q. List courses being taken by student with login rf611.

select c.code, c.title, xc.examcode, c.term, c.popestimate, c.classes
from studentcurr s join xstudentclasscurr xa on s.id=xa.studentid
     join xcourseclasscurr xc on xa.classid=xc.classid
     join coursecurr c on xc.courseid=c.id
where s.login='rf611'
order by c.code

Examples 4
Q. List all PPT tutors and their PPT tutees (role='PPT'), order staff and students:
Tutor lastname, tutor firstname, Tutee lastname, firstname suitably sorted.

Q. List all PPT tutors and how many PPT tutees they have, suitably sorted.

Solutions 4
Q. List all PPT tutors and their PPT tutees (role='PPT'), order staff and students:
Tutor lastname, tutor firstname, Tutee lastname, firstname suitably sorted.

select s.id, s.lastname, s.firstname, t.id, t.lastname, t.firstname
from staffcurr s join xstudentstaffcurr x on s.id=x.staffid
     join studentcurr t on t.id=x.studentid
where x.role='PPT'
order by s.lastname, s.firstname, t.lastname, t.firstname

Q. List all PPT tutors and how many PPT tutees they have, suitably sorted.

select s.id, s.lastname, s.firstname, count(t.id)
from staffcurr s join xstudentstaffcurr x on s.id=x.staffid
     join studentcurr t on t.id=x.studentid
where x.role='PPT'
group by s.id, s.lastname, s.firstname
order by s.lastname, s.firstname

Examples 5
Q. List all tutoring roles




Q. List all tutoring roles and how many tutors there are for each role






Solutions 5
Q. List all tutoring roles

select role
from xstudentstaffcurr
group by role
order by role

Q. List all tutoring roles and how many tutors there are for each role

select role, count(distinct staffid) as tutors
from xstudentstaffcurr
group by role
order by role