SQL (Part 2)

February 5, 2019
Data Science CSCI 1951A
Brown University
Instructor: Ellie Pavlick
HTAs: Wennie Zhang, Maulik Dang, Gurnaaz Kaur

Follow up from last time
• Do Foreign Keys need to reference Primary Keys?

• NO!

• But the attribute they reference does have to be unique. (More soon)

• Also, NULLs are all considered distinct (i.e. NULL = NULL is never true), so we’d want the FK to reference an attribute that is NOT NULL too

• i.e. an FK value of NULL will not let you reference a row in the other table

• So! You should generally stick to the rule of making the FK reference a PK

• If you can’t do this, try refactoring your DB to make it possible, if you are in a position to do so

Follow up from last time
Why does the attribute a foreign key references have to be unique?

Students
First Name   Last Name   Grade
Joe          Shmo        A
Jane         Shmane      F
Joao         Shmoao      B
Elan         Shmo        D

Possible Donors
Last Name: FK   Last-Called
Shmo            two seconds ago
Shmane          three seconds ago
Shmoao          one second ago

Follow up from last time
Why does the attribute a foreign key references have to be unique?

Students
First Name   Last Name   Grade
Jane         Shmane      F
Joao         Shmoao      B
Elan         Shmo        D

Possible Donors
Last Name: FK   Last-Called
Shmo            two seconds ago
Shmane          three seconds ago
Shmoao          one second ago

Follow up from last time
Why does the attribute a foreign key references have to be unique?

Students
First Name   Last Name   Grade
Joe          Shmo        A
Jane         Shmane      F
Joao         Shmoao      B
Elan         Shmo        D

Donations
Last Name: FK   Amount
Shmo            $2
Shmane          $10
Shmoao          $1,000,000
Shmo            $0.02

Follow up from last time

Families
ID   Name
1    Shmo
2    Shmane
3    Shmoao

Students
First Name   Last Name   Grade
Joe          Shmo        A
Jane         Shmane      F
Joao         Shmoao      B
Elan         Shmo        D

Possible Donors
Last Name: FK   Last-Called
Shmo            two seconds ago
Shmane          three seconds ago
Shmoao          one second ago

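The Families refactoring can be tried out in a small sketch. It assumes Python's built-in sqlite3 module, which the lecture does not prescribe; the lowercase table names, the column names, and the helper try_insert are made up for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and it refuses a child insert when the referenced column is not declared PRIMARY KEY or UNIQUE.

```python
import sqlite3

def try_insert(conn, sql, params):
    """Run one INSERT; return None on success or the error message on failure."""
    try:
        conn.execute(sql, params)
        return None
    except sqlite3.Error as exc:
        return str(exc)

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

# Parent tables: last_name repeats in students, id is unique in families.
conn.execute("CREATE TABLE students (first_name TEXT, last_name TEXT, grade TEXT)")
conn.execute("CREATE TABLE families (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?)",
                 [("Joe", "Shmo", "A"), ("Jane", "Shmane", "F"),
                  ("Joao", "Shmoao", "B"), ("Elan", "Shmo", "D")])
conn.executemany("INSERT INTO families VALUES (?, ?)",
                 [(1, "Shmo"), (2, "Shmane"), (3, "Shmoao")])

# Child table 1 (hypothetical): the FK points at a column that is not unique.
conn.execute("""CREATE TABLE donors_by_name (
                    last_name TEXT REFERENCES students(last_name),
                    last_called TEXT)""")
# Child table 2 (hypothetical): the FK points at the primary key of families.
conn.execute("""CREATE TABLE donors_by_family (
                    family_id INTEGER REFERENCES families(id),
                    last_called TEXT)""")

bad = try_insert(conn, "INSERT INTO donors_by_name VALUES (?, ?)",
                 ("Shmo", "two seconds ago"))
good = try_insert(conn, "INSERT INTO donors_by_family VALUES (?, ?)",
                  (1, "two seconds ago"))
dangling = try_insert(conn, "INSERT INTO donors_by_family VALUES (?, ?)",
                      (99, "never"))

print("non-unique target:", bad)
print("unique target:", good)
print("missing family:", dangling)
```

Only the insert that references families(id) with an existing ID goes through; the other two come back with an error message. Other database systems may reject the non-unique reference earlier, at CREATE TABLE time.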
Follow up from last time
• Why would I ever use CHAR(n) as opposed to VARCHAR(n)? Are there any benefits?

• CHAR(n) can be faster

• Can use static memory allocation

• No length checks in operations, so less overhead

• VARCHAR(n) uses less space on average

Announcements
• Have pen/paper or sit by someone who does: this will help for working through longer in-class exercises

• Please don’t leave early! 3 minutes per day = one whole lecture! 😩

• Final projects: Start thinking about teams, watch Piazza, the HTAs are trying to help orchestrate

Outline
• Catch up from last lecture (more SQL keywords)

• NULLs

• Execution Order, Optimization

• Nested Queries, More optimization

• NoSQL (no NoSQL = SQL???)

ORDER BY

TWEET
ID       Time                  Text
782138   2019-01-04 15:04:57   1951A 4 lyfe
389472   2019-01-01 12:34:56   hey
123794   2019-01-01 12:34:57   lol
127890   2019-01-04 17:30:07   hey
893110   2019-01-06 12:21:53   i <3 1951A
596208   2019-01-02 3:14:15    :-D
173902   2019-01-05 3:34:18    i <3 1951A

SELECT Text
FROM Tweet
ORDER BY Time

Result
Text
hey
lol
:-D
1951A 4 lyfe
hey
i <3 1951A
i <3 1951A

ORDER BY

TWEET (same table as in the previous ORDER BY example)

SELECT Text
FROM Tweet
ORDER BY ID

Result
Text
lol
hey
i <3 1951A
hey
:-D
1951A 4 lyfe
i <3 1951A

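To replay the two ORDER BY queries, here is a minimal sketch using Python's built-in sqlite3 module (an assumption of this example, not part of the lecture). The times are zero-padded here so that sorting them as text matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tweet (ID INTEGER, Time TEXT, Text TEXT)")
conn.executemany("INSERT INTO Tweet VALUES (?, ?, ?)", [
    (782138, "2019-01-04 15:04:57", "1951A 4 lyfe"),
    (389472, "2019-01-01 12:34:56", "hey"),
    (123794, "2019-01-01 12:34:57", "lol"),
    (127890, "2019-01-04 17:30:07", "hey"),
    (893110, "2019-01-06 12:21:53", "i <3 1951A"),
    (596208, "2019-01-02 03:14:15", ":-D"),
    (173902, "2019-01-05 03:34:18", "i <3 1951A"),
])

# Same projection, two different sort keys.
by_time = [row[0] for row in conn.execute("SELECT Text FROM Tweet ORDER BY Time")]
by_id = [row[0] for row in conn.execute("SELECT Text FROM Tweet ORDER BY ID")]

print(by_time)
print(by_id)
```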
GROUP BY

TWEET
ID       Likes           Text
782138   1,000           1951A 4 lyfe
389472   10              hey
123794   100             lol
127890   0               hey
893110   8,000,000       i <3 1951A
596208   1               :-D
173902   1,000,000,000   i <3 1951A

SELECT Text, COUNT(*), AVG(Likes)
FROM Tweet
GROUP BY Text

Result
Text           COUNT(*)   AVG(Likes)
lol            1          100
hey            2          5
i <3 1951A     2          504,000,000
:-D            1          1
1951A 4 lyfe   1          1,000

Aggregate functions: SUM, MIN, MAX, COUNT, AVG

HAVING

TWEET
ID       Likes           Text
782138   1,000           1951A 4 lyfe
389472   10              hey
123794   100             lol
127890   0               hey
893110   8,000,000       i <3 1951A
596208   1               :-D
173902   1,000,000,000   i <3 1951A

SELECT Text, COUNT(*), AVG(Likes)
FROM Tweet
GROUP BY Text
HAVING COUNT(*) > 1

Result
Text         COUNT(*)   AVG(Likes)
hey          2          5
i <3 1951A   2          504,000,000

Aggregate functions: SUM, MIN, MAX, COUNT, AVG

LIKE

TWEET
ID       Likes           Text
782138   1,000           1951A 4 lyfe
389472   10              hey
123794   100             lol
127890   0               hey
893110   8,000,000       i <3 1951A
596208   1               :-D
173902   1,000,000,000   i <3 1951A

SELECT Text, COUNT(*), AVG(Likes)
FROM Tweet
WHERE Text LIKE '%1951A%'
GROUP BY Text

Result
Text           COUNT(*)   AVG(Likes)
1951A 4 lyfe   1          1,000
i <3 1951A     2          504,000,000

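The GROUP BY, HAVING and LIKE queries above can be replayed with a short sketch. It assumes Python's built-in sqlite3 module; the helper run and the variable names are invented for this example. Results are collected into dicts keyed by Text because SQL does not promise any order for the groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tweet (ID INTEGER, Likes INTEGER, Text TEXT)")
conn.executemany("INSERT INTO Tweet VALUES (?, ?, ?)", [
    (782138, 1000, "1951A 4 lyfe"),
    (389472, 10, "hey"),
    (123794, 100, "lol"),
    (127890, 0, "hey"),
    (893110, 8000000, "i <3 1951A"),
    (596208, 1, ":-D"),
    (173902, 1000000000, "i <3 1951A"),
])

def run(sql):
    """Return {Text: (COUNT(*), AVG(Likes))} for a three-column query."""
    return {text: (count, avg) for text, count, avg in conn.execute(sql)}

base = "SELECT Text, COUNT(*), AVG(Likes) FROM Tweet "
grouped = run(base + "GROUP BY Text")                           # one row per distinct Text
having = run(base + "GROUP BY Text HAVING COUNT(*) > 1")        # filter applied to groups
like = run(base + "WHERE Text LIKE '%1951A%' GROUP BY Text")    # filter applied to rows

print(grouped)
print(having)
print(like)
```

WHERE filters rows before they are grouped, while HAVING filters the groups afterwards, which is why only HAVING can test COUNT(*).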
IN

STUDENT
ID   Name
1    Wennie
2    Maulik
3    Gurnaaz
4    Jens
5    Erin

GRADES
Student   Course   Grade
1         32       A
2         1951A    A
6         32       A

Find names of students in 1951A:

SELECT Name
FROM STUDENT
WHERE ID IN
    (SELECT Student
     FROM GRADES
     WHERE Course = '1951A')

• The inner SELECT is a “subquery” (More later, get excited)

• The subquery returns a “bag” of student IDs

• IN returns True if ID is in that bag

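A minimal sketch of the IN query, assuming Python's built-in sqlite3 module (not part of the lecture). It evaluates the subquery on its own first, to make the “bag” visible, and then runs the full query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (ID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE GRADES (Student INTEGER, Course TEXT, Grade TEXT)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [(1, "Wennie"), (2, "Maulik"), (3, "Gurnaaz"), (4, "Jens"), (5, "Erin")])
conn.executemany("INSERT INTO GRADES VALUES (?, ?, ?)",
                 [(1, "32", "A"), (2, "1951A", "A"), (6, "32", "A")])

# The subquery alone: the bag of student IDs enrolled in 1951A.
bag = [r[0] for r in conn.execute("SELECT Student FROM GRADES WHERE Course = '1951A'")]

# The full query: keep a student when their ID appears in that bag.
names = [r[0] for r in conn.execute("""
    SELECT Name
    FROM STUDENT
    WHERE ID IN (SELECT Student FROM GRADES WHERE Course = '1951A')""")]

print(bag, names)
```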
ALL/ANY

STUDENT
ID   Name
1    Wennie
2    Maulik
3    Gurnaaz
4    Jens
5    Erin

GRADES
Student   Course   Grade
1         1951A    3.5
2         1951A    3.5
6         1951A    2.8

What is the highest grade in 1951A?

SELECT Grade
FROM GRADES
WHERE Course = '1951A'
AND Grade >= ALL
    (SELECT Grade
     FROM GRADES
     WHERE Course = '1951A')

ALL returns True if the condition holds for all tuples in the bag

ALL/ANY

STUDENT and GRADES (same tables as in the previous ALL/ANY example)

SELECT Grade
FROM GRADES
WHERE Course = '1951A'
AND Grade > ANY
    (SELECT Grade
     FROM GRADES
     WHERE Course = '1951A')

Return all grades except the lowest one.

ALL/ANY

STUDENT and GRADES (same tables as in the previous ALL/ANY example)

SELECT Grade
FROM GRADES
WHERE Course = '1951A'
AND NOT (Grade > ANY
    (SELECT Grade
     FROM GRADES
     WHERE Course = '1951A'))

Return the lowest grade.

ALL/ANY

STUDENT and GRADES (same tables as in the previous ALL/ANY example)

SELECT Grade
FROM GRADES
WHERE Course = '1951A'
AND Grade >= ALL
    (SELECT Grade
     FROM GRADES
     WHERE Course = '1951A')

Result
Grade
3.5
3.5

DISTINCT

STUDENT and GRADES (same tables as in the ALL/ANY examples)

SELECT DISTINCT Grade
FROM GRADES
WHERE Course = '1951A'
AND Grade >= ALL
    (SELECT Grade
     FROM GRADES
     WHERE Course = '1951A')

Result
Grade
3.5

Set operations (Union, Intersection, etc.) remove duplicates by default.

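The quantifiers can be spelled out in a few lines. This sketch assumes Python's built-in sqlite3 module; since SQLite does not implement ALL/ANY, the quantified comparisons are written with Python's all() and any() over the subquery's bag, and the SQL side uses a MAX subquery as a stand-in (equivalent here because no grade is NULL). All variable names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GRADES (Student INTEGER, Course TEXT, Grade REAL)")
conn.executemany("INSERT INTO GRADES VALUES (?, ?, ?)",
                 [(1, "1951A", 3.5), (2, "1951A", 3.5), (6, "1951A", 2.8)])

# The bag produced by the subquery.
bag = [g for (g,) in conn.execute("SELECT Grade FROM GRADES WHERE Course = '1951A'")]

# Quantified comparisons written out explicitly.
ge_all = [g for g in bag if all(g >= other for other in bag)]         # Grade >= ALL (bag)
gt_any = [g for g in bag if any(g > other for other in bag)]          # Grade > ANY (bag)
not_gt_any = [g for g in bag if not any(g > other for other in bag)]  # NOT (Grade > ANY (bag))

# SQLite stand-in for ">= ALL": compare against MAX, with and without DISTINCT.
max_query = """
    SELECT {distinct} Grade FROM GRADES
    WHERE Course = '1951A'
      AND Grade = (SELECT MAX(Grade) FROM GRADES WHERE Course = '1951A')"""
with_dupes = [g for (g,) in conn.execute(max_query.format(distinct=""))]
distinct = [g for (g,) in conn.execute(max_query.format(distinct="DISTINCT"))]

print(ge_all, gt_any, not_gt_any, with_dupes, distinct)
```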
EXISTS

STUDENT
ID   Name
1    Wennie
2    Maulik
3    Gurnaaz
4    Jens
5    Erin

GRADES
Student   Course   Grade
1         1951A    3.5
2         1951A    3.5
6         1951A    2.8

SELECT Name
FROM STUDENT s
WHERE NOT EXISTS
    (SELECT *
     FROM GRADES
     WHERE Course = '1951A'
     AND Student = s.ID)

EXISTS is True as long as the bag is not empty, so this query returns the students who are not in 1951A

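A quick way to check the NOT EXISTS query is to run it on the slide's data. The sketch assumes Python's built-in sqlite3 module; the ORDER BY is added only to make the printed order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (ID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE GRADES (Student INTEGER, Course TEXT, Grade REAL)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [(1, "Wennie"), (2, "Maulik"), (3, "Gurnaaz"), (4, "Jens"), (5, "Erin")])
conn.executemany("INSERT INTO GRADES VALUES (?, ?, ?)",
                 [(1, "1951A", 3.5), (2, "1951A", 3.5), (6, "1951A", 2.8)])

# The subquery is re-evaluated for each student s (it mentions s.ID);
# a student is kept when that subquery returns an empty bag.
not_enrolled = [name for (name,) in conn.execute("""
    SELECT Name
    FROM STUDENT s
    WHERE NOT EXISTS
        (SELECT * FROM GRADES WHERE Course = '1951A' AND Student = s.ID)
    ORDER BY s.ID""")]

print(not_enrolled)
```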
NULL!
• Black hole! NULL is NULL is NULL and there is no coming back from it…

• If an operand is NULL, the result is NULL:

• NULL + 1 = NULL

• NULL * 0 = NULL

• Comparisons: All comparisons that involve a NULL value evaluate to unknown

• NULL = NULL -> Unknown

• NULL < 13 -> Unknown

• NULL > NULL -> Unknown

NULL!
p       q       p OR q   p AND q   p = q
TRUE    TRUE    TRUE     TRUE      TRUE
TRUE    FALSE   TRUE     FALSE     FALSE
FALSE   TRUE    TRUE     FALSE     FALSE
FALSE   FALSE   FALSE    FALSE     TRUE
TRUE    UNK     TRUE     UNK       UNK
FALSE   UNK     UNK      FALSE     UNK
UNK     TRUE    TRUE     UNK       UNK
UNK     FALSE   UNK      FALSE     UNK
UNK     UNK     UNK      UNK       UNK

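The three-valued truth table can be regenerated by asking a database engine directly. This sketch assumes Python's built-in sqlite3 module, where TRUE is 1, FALSE is 0 and unknown comes back as NULL (None in Python); the dictionaries VALUES and NAMES are just labels for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQL literal for each truth value, and the label for each result SQLite returns.
VALUES = {"TRUE": "1", "FALSE": "0", "UNK": "NULL"}
NAMES = {1: "TRUE", 0: "FALSE", None: "UNK"}

table = {}
for p_name, p in VALUES.items():
    for q_name, q in VALUES.items():
        row = conn.execute(f"SELECT {p} OR {q}, {p} AND {q}, {p} = {q}").fetchone()
        table[(p_name, q_name)] = tuple(NAMES[v] for v in row)

# Each entry maps (p, q) to (p OR q, p AND q, p = q).
for key, result in table.items():
    print(key, result)
```

A handy way to read the unknown rows: the answer is known only when it would be the same whichever value the NULL turned out to have, e.g. TRUE OR anything is TRUE.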
NULL!
WHERE: Only tuples which evaluate to true are part of the query result. (I.e. unknown and false are treated equivalently.)

TWEET
ID       Text           Likes
389472   NULL           100
123794   NULL           3
596208   :-D            NULL
782138   1951A 4 lyfe   NULL
173902   i <3 1951A     19
893110   i <3 1951A     7539

SELECT COUNT(*)
FROM TWEET
WHERE Likes != 10

Result
COUNT(*)
4

NULL!
GROUP BY: If NULL exists, then there is a group for NULL.

TWEET
ID       Text           Likes
389472   NULL           100
123794   NULL           3
596208   :-D            NULL
782138   1951A 4 lyfe   NULL
173902   i <3 1951A     19
893110   i <3 1951A     7539

SELECT Text, COUNT(*)
FROM TWEET
GROUP BY Text

Result
Text           COUNT(*)
NULL           2
:-D            1
1951A 4 lyfe   1
i <3 1951A     2

NULL!
For predicates with NULL, use IS (i.e. not “=”)

TWEET
ID       Text           Likes
389472   NULL           100
123794   NULL           3
596208   :-D            NULL
782138   1951A 4 lyfe   NULL
173902   i <3 1951A     19
893110   i <3 1951A     7539

SELECT ID
FROM TWEET
WHERE Text = NULL

Result
ID
(no rows)

NULL!
For predicates with NULL, use IS (i.e. not “=”)

TWEET
ID       Text           Likes
389472   NULL           100
123794   NULL           3
596208   :-D            NULL
782138   1951A 4 lyfe   NULL
173902   i <3 1951A     19
893110   i <3 1951A     7539

SELECT ID
FROM TWEET
WHERE Text IS NULL

Result
ID
389472
123794

NULL!
• count(att): NULL is ignored
• sum(att): NULL is ignored
• avg(att): computed from SUM and COUNT, so NULL is ignored
• min(att) and max(att): NULL is ignored
• Exception! If NULL is the only value in the column, then sum/avg/min/max all return “NULL”

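The NULL rules for WHERE, GROUP BY, IS NULL and the aggregates can all be checked against the TWEET table from these slides. The sketch assumes Python's built-in sqlite3 module, where NULL is written as None; the helper one and the variable names are invented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TWEET (ID INTEGER, Text TEXT, Likes INTEGER)")
conn.executemany("INSERT INTO TWEET VALUES (?, ?, ?)", [
    (389472, None, 100),
    (123794, None, 3),
    (596208, ":-D", None),
    (782138, "1951A 4 lyfe", None),
    (173902, "i <3 1951A", 19),
    (893110, "i <3 1951A", 7539),
])

def one(sql):
    """Return the single result row of a query."""
    return conn.execute(sql).fetchone()

# WHERE keeps only rows whose predicate is true; NULL != 10 is unknown.
not_ten = one("SELECT COUNT(*) FROM TWEET WHERE Likes != 10")[0]
# "= NULL" is never true, "IS NULL" is the right test.
eq_null = conn.execute("SELECT ID FROM TWEET WHERE Text = NULL").fetchall()
is_null = sorted(r[0] for r in conn.execute("SELECT ID FROM TWEET WHERE Text IS NULL"))
# GROUP BY puts all NULLs into one group (key None on the Python side).
groups = dict(conn.execute("SELECT Text, COUNT(*) FROM TWEET GROUP BY Text"))
# Aggregates over a column skip NULLs; COUNT(*) counts rows.
aggregates = one("SELECT COUNT(*), COUNT(Likes), SUM(Likes), AVG(Likes), "
                 "MIN(Likes), MAX(Likes) FROM TWEET")
# With only NULLs to look at, everything except COUNT returns NULL.
only_nulls = one("SELECT COUNT(Likes), SUM(Likes), AVG(Likes), MIN(Likes), MAX(Likes) "
                 "FROM TWEET WHERE Likes IS NULL")

print(not_ten, eq_null, is_null, groups, aggregates, only_nulls)
```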
Clicker Question!

SELECT COUNT(*)
FROM TWEET

Result: COUNT(*) = 100

SELECT COUNT(*)
FROM TWEET
WHERE Text = ':)'

Result: COUNT(*) = 15

What will be the result of this query?

SELECT COUNT(*)
FROM TWEET
WHERE Text != ':)'

(a) COUNT(*) = 100
(b) COUNT(*) = 85
(c) I…don’t…know…

Can’t say how many are NULL

Clicker Question!

RUNNERS
ID   Name
1    Wennie
2    Maulik
3    Gurnaaz
4    Haomo

RACES
Event_ID   Event     Winner_ID
1          Wennie    2
2          Maulik    3
3          Gurnaaz   2
4          Haomo     NULL

What will be the result of the below query?

SELECT COUNT(*)
FROM RUNNERS
WHERE ID NOT IN (SELECT Winner_ID FROM RACES)

(a) COUNT(*) = 0
(b) COUNT(*) = 1
(c) COUNT(*) = 2

ID NOT IN (2,3,NULL) is the same as ID != 2 AND ID != 3 AND ID != NULL, and ID != NULL is always unknown, so no row can satisfy the predicate

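The NOT IN trap is easy to reproduce. This sketch assumes Python's built-in sqlite3 module and leaves out the Event column, which the query never touches; the two alternative queries are suggestions of this example, not something from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RUNNERS (ID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE RACES (Event_ID INTEGER, Winner_ID INTEGER)")
conn.executemany("INSERT INTO RUNNERS VALUES (?, ?)",
                 [(1, "Wennie"), (2, "Maulik"), (3, "Gurnaaz"), (4, "Haomo")])
conn.executemany("INSERT INTO RACES VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 2), (4, None)])

def count(sql):
    """Return the single number produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]

# The bag contains a NULL, so "ID NOT IN (...)" is never true.
with_null = count(
    "SELECT COUNT(*) FROM RUNNERS WHERE ID NOT IN (SELECT Winner_ID FROM RACES)")

# Option 1: filter the NULLs out of the bag first.
nulls_filtered = count("""
    SELECT COUNT(*) FROM RUNNERS
    WHERE ID NOT IN (SELECT Winner_ID FROM RACES WHERE Winner_ID IS NOT NULL)""")

# Option 2: NOT EXISTS only asks whether a matching row exists.
not_exists = count("""
    SELECT COUNT(*) FROM RUNNERS r
    WHERE NOT EXISTS (SELECT * FROM RACES WHERE Winner_ID = r.ID)""")

print(with_null, nulls_filtered, not_exists)
```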
Relational Algebra Recap
• σ_{condition}(S): select, return a relation containing just the tuples in S that meet condition

• π_{attribute_list}(S): project, return a relation S’ containing the following: for each tuple t in S there is a tuple t’ in S’ that contains the attributes of t that are in attribute_list

• ∪(S,S’): union, typical set-theoretic definitions (same for intersection, minus)

• S × S’: cross product, return a new relation S’’ such that, for every t in S and t’ in S’, (t, t’) is in S’’.

• ρ_R(S): rename the relation S to R

SQL -> Relational Algebra
Operators: σ_{condition}(S): select; π_{attribute_list}(S): project; ∪(S, S’): union; S × S’: cross product; ρ_R(S): rename

TWEET
ID       Time       Text
389472   12:34:56   hey
123794   12:34:57   lol
596208   3:14:15    :-D
782138   15:04:57   1951A 4 lyfe
173902   3:34:18    i <3 1951A
893110   12:21:53   i <3 1951A

SQL
SELECT ID, Text
FROM TWEET

Relational Algebra
π_{ID,Text}(TWEET)

Result
ID       Text
389472   hey
123794   lol
596208   :-D
782138   1951A 4 lyfe
173902   i <3 1951A
893110   i <3 1951A

SQL -> Relational Algebra

TWEET
ID       Time       Text
389472   12:34:56   hey
123794   12:34:57   lol
596208   3:14:15    :-D
782138   15:04:57   1951A 4 lyfe
173902   3:34:18    i <3 1951A
893110   12:21:53   i <3 1951A

SQL
SELECT ID, Text
FROM TWEET
WHERE Text = 'hey'

Relational Algebra
π_{ID,Text}(σ_{Text="hey"}(TWEET))

Result
ID       Text
389472   hey

Clicker Question!

Do these queries return the same relation?

(a) Yep (b) Nah

TWEET
ID       Time       Text
389472   12:34:56   hey
123794   12:34:57   lol
596208   3:14:15    :-D
782138   15:04:57   1951A 4 lyfe
173902   3:34:18    i <3 1951A
893110   12:21:53   i <3 1951A

SELECT Text FROM TWEET

π_{Text}(TWEET)

Clicker Question!

Do these queries return the same relation?

(a) Yep (b) Nah

SELECT DISTINCT Text FROM TWEET

π_{Text}(TWEET)

SQL -> Relational Algebra

PERSON
Handle   Name
m        Maulik
w        Wennie
g        Gurnaaz

RETWEET
Person   Tweet
m        1
m        2
w        1

SQL
SELECT Name
FROM PERSON, RETWEET
WHERE PERSON.Handle = RETWEET.Person

Relational Algebra
π_{Name}(σ_{PERSON.Handle = RETWEET.Person}(PERSON × RETWEET))

SQL -> Relational Algebra

PERSON
Handle   Name
m        Maulik
w        Wennie
g        Gurnaaz

RETWEET
Person   Tweet
m        1
m        2
w        1

SQL
SELECT Name
FROM PERSON AS p,
     RETWEET AS r
WHERE r.Person = p.Handle

Relational Algebra
π_{Name}(σ_{p.Handle = r.Person}(ρ_p(PERSON) × ρ_r(RETWEET)))

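One way to get a feel for the operators is to write them as tiny functions over lists of dicts. This is only a sketch: the function names select, project, rename and cross are made up here, relations are plain Python lists, and project follows the set-based definition by dropping duplicate tuples.

```python
from itertools import product

def select(pred, rel):
    """sigma: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    """pi: keep only attrs, dropping duplicate tuples (set semantics)."""
    seen, out = set(), []
    for t in rel:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

def rename(name, rel):
    """rho: prefix every attribute with the new relation name."""
    return [{f"{name}.{a}": v for a, v in t.items()} for t in rel]

def cross(left, right):
    """x: pair every tuple of left with every tuple of right."""
    return [{**l, **r} for l, r in product(left, right)]

PERSON = [{"Handle": "m", "Name": "Maulik"},
          {"Handle": "w", "Name": "Wennie"},
          {"Handle": "g", "Name": "Gurnaaz"}]
RETWEET = [{"Person": "m", "Tweet": 1},
           {"Person": "m", "Tweet": 2},
           {"Person": "w", "Tweet": 1}]

# pi_{Name}(sigma_{p.Handle = r.Person}(rho_p(PERSON) x rho_r(RETWEET)))
pairs = cross(rename("p", PERSON), rename("r", RETWEET))
matched = select(lambda t: t["p.Handle"] == t["r.Person"], pairs)
names = project(["p.Name"], matched)

print(len(pairs), len(matched), names)
```

Note that the projection yields Maulik once even though two retweets matched; the SQL query without DISTINCT would return that name twice, which is the bag-versus-set difference behind the clicker question on SELECT Text versus π_{Text}(TWEET).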
Execution Order

SELECT A1…An
FROM R1…Rk
WHERE P

      π_{A1…An}
          |
        σ_{P}
          |
          ×
         / \
        ×   Rk
       / \
      ×   …
     / \
   R1   R2

SQL -> Relational Algebra

TWEET
ID       Time       Text
389472   12:34:56   hey
123794   12:34:57   lol
596208   3:14:15    :-D
782138   15:04:57   1951A 4 lyfe
173902   3:34:18    i <3 1951A
893110   12:21:53   i <3 1951A

SQL
SELECT ID, Text
FROM TWEET
WHERE Text = 'hey'

Result
ID       Text
389472   hey

Relational Algebra (first tree)
π_{ID,Text}
     |
σ_{Text="hey"}
     |
   TWEET

Relational Algebra (second tree)
σ_{Text="hey"}
     |
π_{ID,Text}
     |
   TWEET

A query can have multiple “equivalent” trees

Clicker Question!

Which is better?

(a) σ_{condition}(π_{attr_list}(R))

(b) π_{attr_list}(σ_{condition}(R))

SQL -> Relational Algebra

TWEET
ID       Time       Text
389472   12:34:56   hey
123794   12:34:57   lol
596208   3:14:15    :-D
782138   15:04:57   1951A 4 lyfe
173902   3:34:18    i <3 1951A
893110   12:21:53   i <3 1951A

SQL
SELECT ID, Text
FROM TWEET
WHERE Text = 'hey'

Relational Algebra
σ_{Text="hey"}(π_{ID,Text}(TWEET))

π_{ID,Text}(σ_{Text="hey"}(TWEET))

SQL -> Relational Algebra

TWEET
ID       Time       Text
389472   12:34:56   hey
123794   12:34:57   lol
596208   3:14:15    :-D
782138   15:04:57   1951A 4 lyfe
173902   3:34:18    i <3 1951A
893110   12:21:53   i <3 1951A

SQL
SELECT ID, Time
FROM TWEET
WHERE Text = 'hey'

Relational Algebra
σ_{Text="hey"}(π_{ID,Time}(TWEET))

π_{ID,Time}(σ_{Text="hey"}(TWEET))

Evaluating σ_{Text="hey"}(π_{ID,Time}(TWEET)): the projection runs first and leaves

ID       Time
389472   12:34:56
123794   12:34:57
596208   3:14:15
782138   15:04:57
173902   3:34:18
893110   12:21:53

Text is gone at this point, so σ_{Text="hey"} has nothing left to test.

Evaluating π_{ID,Time}(σ_{Text="hey"}(TWEET)): the selection runs first and leaves

ID       Time       Text
389472   12:34:56   hey

and the projection then gives

ID       Time
389472   12:34:56

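The two orders can be tried directly. In this sketch relations are lists of dicts and select, project and is_hey are made-up helper names; projecting first throws away Text, so the later selection fails with a KeyError, while selecting first works.

```python
TWEET = [
    {"ID": 389472, "Time": "12:34:56", "Text": "hey"},
    {"ID": 123794, "Time": "12:34:57", "Text": "lol"},
    {"ID": 596208, "Time": "3:14:15", "Text": ":-D"},
    {"ID": 782138, "Time": "15:04:57", "Text": "1951A 4 lyfe"},
    {"ID": 173902, "Time": "3:34:18", "Text": "i <3 1951A"},
    {"ID": 893110, "Time": "12:21:53", "Text": "i <3 1951A"},
]

def select(pred, rel):
    """sigma: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    """pi: keep only the listed attributes (no duplicates arise here, ID is unique)."""
    return [{a: t[a] for a in attrs} for t in rel]

def is_hey(t):
    return t["Text"] == "hey"

# pi_{ID,Time}(sigma_{Text="hey"}(TWEET)): select while Text is still there.
select_then_project = project(["ID", "Time"], select(is_hey, TWEET))

# sigma_{Text="hey"}(pi_{ID,Time}(TWEET)): Text was projected away before the test.
try:
    select(is_hey, project(["ID", "Time"], TWEET))
    project_then_select = "ok"
except KeyError as missing:
    project_then_select = f"missing attribute {missing}"

print(select_then_project)
print(project_then_select)
```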
Execution Order

SELECT A1…An
FROM R1…Rk
WHERE P

      π_{A1…An}
          |
        σ_{P}
          |
          ×
         / \
        ×   Rk
       / \
      ×   …
     / \
   R1   R2

“Canonical Execution Order” (FROM, then WHERE, then SELECT)

Clicker Question!
How much memory do I need? Say each R has O(m) tuples.

SELECT A1…An
FROM R1…Rk
WHERE P

      π_{A1…An}
          |
        σ_{P}
          |
          ×          m^(k-1) × m tuples
         / \
        ×   Rk       (m × m) × m tuples
       / \
      ×   …          m × m tuples
     / \
   R1   R2

(a) O(m^k)
(b) O(m × k)
(c) O(m + k)
(d) O(m^(k-n))

m = 1000, k = 3 -> 1 billion tuples

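The growth is easy to verify on a small case. A minimal sketch: cross_product_size is a made-up helper that just evaluates m^k, and the brute-force count uses itertools.product on three toy relations of 10 tuples each.

```python
from itertools import product

def cross_product_size(m, k):
    """Number of tuples in R1 x ... x Rk when every relation has m tuples."""
    return m ** k

# Brute-force check on a small case: k = 3 relations with m = 10 tuples each.
relations = [range(10)] * 3
enumerated = sum(1 for _ in product(*relations))

print(enumerated, cross_product_size(10, 3))
print(cross_product_size(1000, 3))  # the m = 1000, k = 3 case
```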
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
and TWEET.Date == ’01/01/2019‘
and AUTHOR.Person = “BarackObama”

πTWEET.Time
σ(A.TWEET = T.ID)⋀(T.Date=“1/1/19”)⋀(A.Person =“BarakckObama”)

×
TWEET AUTHOR

“Canonical Execution Order” (FROM WHERE SELECT)


95
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
and TWEET.Date == ’01/01/2019‘
and AUTHOR.Person = “BarackObama”

πTWEET.Time
σ(A.TWEET = T.ID)⋀(T.Date=“1/1/19”)⋀(A.Person =“BarakckObama”)

×
6,000 /second =
TWEET AUTHOR
500M/day =
Billions and billions
http://www.internetlivestats.com/twitter-statistics/
96 https://www.omnicoreagency.com/twitter-statistics/
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
and TWEET.Date == ’01/01/2019‘
and AUTHOR.Person = “BarackObama”

πTWEET.Time
σ(A.TWEET = T.ID)⋀(T.Date=“1/1/19”)⋀(A.Person =“BarakckObama”)

×
6,000 /second =
TWEET AUTHOR
500M/day =
100s of millions
Billions and billions
http://www.internetlivestats.com/twitter-statistics/
97 https://www.omnicoreagency.com/twitter-statistics/
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
AND TWEET.Date = '01/01/2019'
AND AUTHOR.Person = 'BarackObama'

πTWEET.Time
σ(A.TWEET = T.ID)⋀(T.Date=“1/1/19”)⋀(A.Person=“BarackObama”)
O(really ****ing big) ×
TWEET AUTHOR
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
AND TWEET.Date = '01/01/2019'
AND AUTHOR.Person = 'BarackObama'

πTWEET.Time
σ(A.TWEET = T.ID)⋀(T.Date=“1/1/19”)⋀(A.Person=“BarackObama”)
O(kind of tiny) ×
TWEET AUTHOR
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
AND TWEET.Date = '01/01/2019'
AND AUTHOR.Person = 'BarackObama'

πTWEET.Time
σ(A.TWEET = T.ID)⋀(T.Date=“1/1/19”)⋀(A.Person=“BarackObama”)
×
TWEET AUTHOR

Thoughts??
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
AND TWEET.Date = '01/01/2019'
AND AUTHOR.Person = 'BarackObama'

πTWEET.Time
σ(A.TWEET = T.ID)⋀(A.Person=“BarackObama”)
×
σDate=“1/1/19”    AUTHOR
TWEET
Execution Order
SELECT TWEET.Time
FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
AND TWEET.Date = '01/01/2019'
AND AUTHOR.Person = 'BarackObama'

πTWEET.Time
σA.TWEET = T.ID
×
σDate=“1/1/19”    σPerson=“BarackObama”
TWEET             AUTHOR
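The pushed-down plan can be compared with the canonical one on a toy database. This is a sketch under stated assumptions: the three sample rows per table are made up, and the nested `SELECT` subqueries are just one way to spell "filter each table before the cross product" in SQL. A real query optimizer would typically make this choice itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TWEET (ID INTEGER PRIMARY KEY, Date TEXT, Time TEXT);
CREATE TABLE AUTHOR (TWEET INTEGER, Person TEXT);
-- Hypothetical sample rows, not from the lecture.
INSERT INTO TWEET VALUES (1, '01/01/2019', '09:00'),
                         (2, '01/01/2019', '10:30'),
                         (3, '01/02/2019', '11:15');
INSERT INTO AUTHOR VALUES (1, 'BarackObama'), (2, 'SomeoneElse'), (3, 'BarackObama');
""")

# Canonical plan: cross product first, then all three conditions.
canonical = """
SELECT TWEET.Time FROM TWEET, AUTHOR
WHERE AUTHOR.TWEET = TWEET.ID
  AND TWEET.Date = '01/01/2019'
  AND AUTHOR.Person = 'BarackObama'
"""

# Pushed-down plan: filter each table, then combine what is left.
pushed_down = """
SELECT t.Time
FROM (SELECT * FROM TWEET WHERE Date = '01/01/2019') t,
     (SELECT * FROM AUTHOR WHERE Person = 'BarackObama') a
WHERE a.TWEET = t.ID
"""

canonical_rows = sorted(conn.execute(canonical).fetchall())
pushed_rows = sorted(conn.execute(pushed_down).fetchall())

# Size of the intermediate cross product under each plan.
cross_size = conn.execute("SELECT COUNT(*) FROM TWEET, AUTHOR").fetchone()[0]
filtered_size = conn.execute("""
SELECT COUNT(*)
FROM (SELECT * FROM TWEET WHERE Date = '01/01/2019') t,
     (SELECT * FROM AUTHOR WHERE Person = 'BarackObama') a
""").fetchone()[0]
print(canonical_rows, pushed_rows, cross_size, filtered_size)
```

Both plans return the same answer; only the size of the intermediate result differs (9 combined rows versus 4 in this toy data).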
Clicker Question! (Demand?)
Optimize this.
Find grades of students taking 1951A ahead of schedule.

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course Grade Tgt_Yr
1 32 A 1
2 1951A A 3
6 32 A 1

SELECT Grade
FROM STUDENT, GRADES
WHERE STUDENT.ID = GRADES.Student
AND GRADES.Course = '1951A'
AND STUDENT.Year < GRADES.Tgt_Yr

πGrade
σ(ID = Student)⋀(Course = 1951A)⋀(Year < Tgt_Yr)
×
STUDENT GRADES
Clicker Question!

(a) πGrade
    σID = Student
    ×
    σYear < Tgt_Yr    σCourse = 1951A
    STUDENT           GRADES

(b) πGrade
    σYear < Tgt_Yr    <- Depends on output of join
    ×
    σID = Student     σCourse = 1951A
    STUDENT           GRADES

(c) πGrade
    σID = Student ⋀ σYear < Tgt_Yr
    ×
    STUDENT           σCourse = 1951A
                      GRADES

Outline

• Catching up from last lecture (more SQL keywords)

• NULLs

• Execution Order, Optimization

• Correlated Subqueries, More optimization

Nested Queries
STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.Name                        -- Outer Query
FROM STUDENT s
WHERE NOT EXISTS(
    SELECT *                         -- Inner Query
    FROM GRADES
    WHERE s.ID = GRADES.Student
)

Find names of students who are not in any classes.
Nested Queries
STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.Name
FROM STUDENT s
WHERE NOT EXISTS(
    SELECT *
    FROM GRADES
    WHERE s.ID = GRADES.Student
)

Correlated! Inner query will return a different result for every row…

Find names of students who are not in any classes.
Nested Queries
STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.Name
FROM STUDENT s
WHERE s.ID NOT IN(
    SELECT Student
    FROM GRADES
)

Not correlated! Inner query will always return the same thing.

Find names of students who are not in any classes.
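Both forms can be run against the STUDENT and GRADES tables from the slides. The sketch below loads them into an in-memory SQLite database; the column types and the `ORDER BY s.ID` (added only to make the output order predictable) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (ID INTEGER PRIMARY KEY, Name TEXT, Year INTEGER);
CREATE TABLE GRADES (Student INTEGER, Course TEXT, GPA REAL, Tgt_Yr INTEGER);
INSERT INTO STUDENT VALUES (1,'Wennie',4),(2,'Maulik',5),(3,'Gurnaaz',5),
                           (4,'Jens',4),(5,'Erin',4);
INSERT INTO GRADES VALUES (1,'32',4.0,1),(2,'1951A',3.5,3),(6,'32',2.8,1);
""")

# Correlated: the inner query mentions s.ID, so it depends on the outer row.
correlated = """
SELECT s.Name FROM STUDENT s
WHERE NOT EXISTS (SELECT * FROM GRADES WHERE s.ID = GRADES.Student)
ORDER BY s.ID
"""

# Not correlated: the inner query stands alone and can be evaluated once.
uncorrelated = """
SELECT s.Name FROM STUDENT s
WHERE s.ID NOT IN (SELECT Student FROM GRADES)
ORDER BY s.ID
"""

names_exists = [row[0] for row in conn.execute(correlated)]
names_not_in = [row[0] for row in conn.execute(uncorrelated)]
print(names_exists, names_not_in)
```

The two queries agree here because GRADES.Student contains no NULLs. If it did, the NOT IN version would return no rows at all, so the rewrite is only safe when that column is known to be non-NULL.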
Nested Queries
How many courses is each student taking?

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.ID, s.Name,
    (SELECT COUNT(*) as num_courses
     FROM GRADES g
     WHERE s.ID = g.Student)
FROM STUDENT s
Clicker Question!
How many courses is each student taking?

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.ID, s.Name,
    (SELECT COUNT(*) as num_courses
     FROM GRADES g
     WHERE s.ID = g.Student)
FROM STUDENT s

Is this query correlated?
(a) uh huh (b) nuh uh

Yes! This value will be different for every row (i.e. for every s.ID)
Nested Queries
How many courses is each student taking?

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.ID, s.Name, c.num_courses
FROM STUDENT s,
    (SELECT Student,
            COUNT(*) AS num_courses
     FROM GRADES
     GROUP BY Student) c
WHERE s.ID = c.Student
Clicker Question!
How many courses is each student taking?

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.ID, s.Name, c.num_courses
FROM STUDENT s,
    (SELECT Student,
            COUNT(*) AS num_courses
     FROM GRADES
     GROUP BY Student) c
WHERE s.ID = c.Student

Is this query correlated?
(a) yeah sure (b) not really

This value is always the same, regardless of the row
Rewriting Queries
How many courses is each student taking?

SELECT s.ID, s.Name,
    (SELECT COUNT(*) as num_courses
     FROM GRADES g
     WHERE s.ID = g.Student)
FROM STUDENT s

Executed for every row
Rewriting Queries
How many courses is each student taking?

SELECT s.ID, s.Name,
    (SELECT COUNT(*) as num_courses
     FROM GRADES g
     WHERE s.ID = g.Student)
FROM STUDENT s

SELECT s.ID, s.Name, c.num_courses
FROM STUDENT s,
    (SELECT Student, COUNT(*) as num_courses    -- Only executed once
     FROM GRADES
     GROUP BY Student) c
WHERE s.ID = c.Student
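The two versions can be compared on the slide tables. In this sketch the column types and the `ORDER BY s.ID` are assumptions added for a runnable, predictable example. One difference is worth noticing: the correlated form reports a count of 0 for students with no rows in GRADES, while the join form leaves those students out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (ID INTEGER PRIMARY KEY, Name TEXT, Year INTEGER);
CREATE TABLE GRADES (Student INTEGER, Course TEXT, GPA REAL, Tgt_Yr INTEGER);
INSERT INTO STUDENT VALUES (1,'Wennie',4),(2,'Maulik',5),(3,'Gurnaaz',5),
                           (4,'Jens',4),(5,'Erin',4);
INSERT INTO GRADES VALUES (1,'32',4.0,1),(2,'1951A',3.5,3),(6,'32',2.8,1);
""")

# Correlated scalar subquery: evaluated once per STUDENT row.
per_row = """
SELECT s.ID, s.Name,
       (SELECT COUNT(*) FROM GRADES g WHERE s.ID = g.Student) AS num_courses
FROM STUDENT s
ORDER BY s.ID
"""

# Uncorrelated rewrite: group GRADES once, then join on the student ID.
grouped_once = """
SELECT s.ID, s.Name, c.num_courses
FROM STUDENT s,
     (SELECT Student, COUNT(*) AS num_courses
      FROM GRADES
      GROUP BY Student) c
WHERE s.ID = c.Student
ORDER BY s.ID
"""

per_row_result = conn.execute(per_row).fetchall()
grouped_result = conn.execute(grouped_once).fetchall()
print(per_row_result)
print(grouped_result)
```

If the zero-count rows matter, the rewrite would need an outer join instead of the comma join; that variant is not shown on the slides.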
(non)Clicker Question!
Rewrite to remove the subquery altogether?

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.Name
FROM STUDENT s
WHERE EXISTS(
    SELECT * FROM GRADES
    WHERE s.ID = GRADES.Student
    AND s.Year < GRADES.Tgt_Yr
)

HINT! Use a Join Condition

Find students taking courses that are above their level.
(non)Clicker Question!
Rewrite to remove the subquery altogether?

STUDENT
ID Name Year
1 Wennie 4
2 Maulik 5
3 Gurnaaz 5
4 Jens 4
5 Erin 4

GRADES
Student Course GPA Tgt_Yr
1 32 4.0 1
2 1951A 3.5 3
6 32 2.8 1

SELECT s.Name
FROM STUDENT s, GRADES g
WHERE s.ID = g.Student
AND s.Year < g.Tgt_Yr

Find students taking courses that are above their level.
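One caveat about the join rewrite can be seen on a small example. The data below is hypothetical (the slide tables happen to contain no student with Year < Tgt_Yr, so both queries would return nothing on them). When a student matches several GRADES rows, EXISTS lists that student once, while the plain join lists them once per matching row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (ID INTEGER PRIMARY KEY, Name TEXT, Year INTEGER);
CREATE TABLE GRADES (Student INTEGER, Course TEXT, Tgt_Yr INTEGER);
-- Hypothetical rows: Ada (year 1) takes two courses above her level.
INSERT INTO STUDENT VALUES (1, 'Ada', 1), (2, 'Bo', 4);
INSERT INTO GRADES VALUES (1, '101', 2), (1, '102', 3), (2, '101', 2);
""")

with_exists = """
SELECT s.Name FROM STUDENT s
WHERE EXISTS (SELECT * FROM GRADES
              WHERE s.ID = GRADES.Student AND s.Year < GRADES.Tgt_Yr)
"""

with_join = """
SELECT s.Name FROM STUDENT s, GRADES g
WHERE s.ID = g.Student AND s.Year < g.Tgt_Yr
"""

# DISTINCT on the student ID and name collapses the repeated matches.
with_join_distinct = """
SELECT DISTINCT s.ID, s.Name FROM STUDENT s, GRADES g
WHERE s.ID = g.Student AND s.Year < g.Tgt_Yr
"""

exists_names = [r[0] for r in conn.execute(with_exists)]
join_names = [r[0] for r in conn.execute(with_join)]
distinct_names = [r[1] for r in conn.execute(with_join_distinct)]
print(exists_names, join_names, distinct_names)
```

Whether the duplicates matter depends on what the result is used for; the slide's rewrite is fine when each student matches at most one row.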
