CS 5200: Database Management Systems
Lecture 6: SQL Queries

10/9/24: Lecture 6: SQL Queries CS 5200 Fall '24 (Cobbe) 1


Agenda
● Administrative notes
● Working with AUTO_INCREMENT keys
● SQL’s Data Manipulation Language (DML)
○ Basic queries
○ Grouping and aggregation
○ Advanced query features



Admin
● HW 3 published by the end of class Friday
○ Due by the start of class next week, Wednesday, Oct 16.
○ Late work on HW 3 not accepted
● Lecture next week: midterm review & PM1 presentations
○ Will discuss answers to HW 3, among other things
● Midterm: Wednesday, Oct 23 or Thursday, Oct 24.
○ In-person: in normal classroom, during normal class hours
○ Pencil & paper; devices not permitted. I'll have paper; bring a writing implement.
○ You may bring notes (on paper!) with you.
● PM1 due Friday evening
● PM2 is available; due (with presentations) Wednesday, Oct 30.



PM1 Presentations
● Each team should choose a team member to present
○ We'll rotate through these in later milestones
● Plan for 5–10 minutes
● Be ready to display (portions of) your E-R diagram
○ I'll set up a Zoom meeting; presenters will open diagram on their machines & share screen
● Not enough time to describe entire design
● Pick a few interesting cases:
○ where your team had to decide between 2–3 reasonable alternatives, and why you chose your final answer
○ places where you had to make assumptions due to ambiguities or missing details in the spec
● Be ready for (and with) questions!



Deadlines and Extensions
Reminder: deadlines are real.

Extensions are possible, but:

● Asking for an extension a few hours before (or after!) the deadline is too late
○ unless the nature of the emergency precludes being able to send a message
○ "I had the flu" does not prevent you logging in to Piazza
● "I forgot" is not an excuse.
● "I had to prepare for an interview" is not an excuse.



Working with AUTO_INCREMENT



Foreign Keys and AUTO_INCREMENT
Consider part of our blogging application:

BlogComment.postID refers to BlogPost.postID

BlogPost.postID is a good candidate to be AUTO_INCREMENT



Foreign Keys and AUTO_INCREMENT
● BlogPost.postID is a good candidate for AUTO_INCREMENT
● What about BlogComment.postID?
● Probably should not be AUTO_INCREMENT
○ Field's purpose is to refer to an existing BlogPost record
○ Assigning an arbitrary value to BlogComment.postID defeats this purpose



Inserting Records
● How do we insert records into these tables?
○ Ex: attempting to populate database with sample records for testing
● Do we let the database choose a value for BlogPost.postID?
○ INSERT INTO BlogPost (author, title, contents) VALUES ('jdoe', 'new pet', '...');
INSERT INTO BlogComment (postID, author, contents) VALUES (???, 'sdavis', '...');
○ What value do we supply for ???
● When working with INSERT statements by hand, must either:
○ query BlogPost after insert to find postID value & use that value in 2nd insert statement, or
○ bypass AUTO_INCREMENT and specify postID value for both records manually
● Up to you, though first option is not really practical in a .sql script



Inserting Records
● So, given all that, what's the point of AUTO_INCREMENT?
● When we get to JDBC in a few weeks, we'll see that there's a better option
○ We can ask the database to return the generated postID from the insertion operation.
○ That value is then available for use in subsequent insertions.
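
The "ask the database for the generated key" idea can be sketched outside JDBC as well. The following is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for MySQL: AUTOINCREMENT and lastrowid are SQLite/Python names, the commentID column is hypothetical, and the JDBC API covered later will look different.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; INTEGER PRIMARY KEY AUTOINCREMENT
# plays the role of MySQL's AUTO_INCREMENT.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE BlogPost (
        postID   INTEGER PRIMARY KEY AUTOINCREMENT,
        author   TEXT,
        title    TEXT,
        contents TEXT
    )""")
conn.execute("""
    CREATE TABLE BlogComment (
        commentID INTEGER PRIMARY KEY AUTOINCREMENT,  -- hypothetical column
        postID    INTEGER NOT NULL REFERENCES BlogPost(postID),
        author    TEXT,
        contents  TEXT
    )""")

# Let the database choose postID, then read the generated value back.
cur = conn.execute(
    "INSERT INTO BlogPost (author, title, contents) VALUES (?, ?, ?)",
    ("jdoe", "new pet", "..."))
new_post_id = cur.lastrowid

# Use the generated key as the foreign key of the comment.
conn.execute(
    "INSERT INTO BlogComment (postID, author, contents) VALUES (?, ?, ?)",
    (new_post_id, "sdavis", "..."))

comment_post_ids = [r[0] for r in conn.execute("SELECT postID FROM BlogComment")]
```

Only BlogPost.postID is generated here; BlogComment.postID is always supplied explicitly, which matches the earlier advice that the foreign-key column should not be AUTO_INCREMENT.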



SQL SELECT



SELECT Statement
● Declarative queries to retrieve data
● Allows us to use all of the relational algebra operations we’ve discussed, not just σ
● Example:
SELECT FirstName, LastName
FROM BlogUsers
WHERE DoB > '1990-02-05'



SELECT
The SELECT clause defines the columns in the resulting table. Options include:

● Projection: SELECT FirstName, LastName
● Renaming: SELECT FirstName AS First, LastName AS Last
● Defining new columns:
○ SELECT FirstName, 1 AS Const
○ SELECT MONTH(DoB) AS BirthMonth
○ SELECT CONCAT(FirstName, ' ', LastName) AS FullName
○ SELECT IF(MONTH(DoB) > 6 AND MONTH(DoB) < 9, 'Summer', 'NotSummer') AS BirthSeason
● Shorthand: SELECT * FROM table
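
The column options above can be tried out with a rough sketch. It assumes Python's sqlite3 module in place of MySQL, so the MySQL functions are swapped for SQLite equivalents: || instead of CONCAT, strftime('%m', ...) instead of MONTH, and CASE instead of IF. The second user's DoB is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BlogUsers ("
    "UserName TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT, DoB TEXT)")
conn.executemany("INSERT INTO BlogUsers VALUES (?, ?, ?, ?)", [
    ("jy", "Jae", "Yoon", "2005-01-01"),
    ("tony", "Tony", "Davidson", "1996-07-15"),  # hypothetical DoB
])

# Renaming plus three computed columns.
# BETWEEN 7 AND 8 is the same test as "> 6 AND < 9" on whole months.
rows = conn.execute("""
    SELECT FirstName AS First,
           FirstName || ' ' || LastName AS FullName,
           CAST(strftime('%m', DoB) AS INTEGER) AS BirthMonth,
           CASE WHEN CAST(strftime('%m', DoB) AS INTEGER) BETWEEN 7 AND 8
                THEN 'Summer' ELSE 'NotSummer' END AS BirthSeason
    FROM BlogUsers
    ORDER BY UserName
""").fetchall()
```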



WHERE
The WHERE clause is relational algebra selection (σ): controls which rows appear
in the final answer.
● Arbitrary Boolean-valued expression
● Most DBMSs support a library of functions that can be used here
● Only rows for which expression evaluates to TRUE are included in the result
Examples:
● WHERE FirstName = 'Jae'
● WHERE CONCAT(FirstName, ' ', LastName) = 'Jae Yoon'
● WHERE MONTH(DoB) > 6 AND MONTH(DoB) < 9
NULL is tricky here; we’ll come back to this.



FROM
FROM clause specifies source table(s) for data, allowing join and cross-product
operations:

FROM BlogPosts INNER JOIN BlogComments
ON BlogPosts.postId = BlogComments.postId

Alternatively:
FROM BlogPosts, BlogComments
WHERE BlogPosts.postId = BlogComments.postId
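
A small check that the two forms return the same rows, assuming Python's sqlite3 module in place of MySQL and a few made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BlogPosts (postId INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE BlogComments (
        commentId INTEGER PRIMARY KEY, postId INTEGER, content TEXT);
    INSERT INTO BlogPosts VALUES (1, 'new pet'), (2, 'no comments yet');
    INSERT INTO BlogComments VALUES (10, 1, 'cute!'), (11, 1, 'what breed?');
""")

# Explicit join syntax.
explicit_join = conn.execute("""
    SELECT BlogPosts.postId, commentId
    FROM BlogPosts INNER JOIN BlogComments
         ON BlogPosts.postId = BlogComments.postId
    ORDER BY commentId
""").fetchall()

# Cross product filtered by WHERE.
cross_product_filter = conn.execute("""
    SELECT BlogPosts.postId, commentId
    FROM BlogPosts, BlogComments
    WHERE BlogPosts.postId = BlogComments.postId
    ORDER BY commentId
""").fetchall()
```

Post 2 has no comments, so it appears in neither result; an outer join would be needed to keep it.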



FROM: Full Generality
● FROM table
● FROM table AS alias
● FROM (SELECT ...)
● FROM BlogPosts INNER JOIN BlogComments USING (PostId)
○ LEFT OUTER JOIN
○ RIGHT OUTER JOIN
○ CROSS JOIN
● FROM BlogPosts INNER JOIN BlogComments
ON BlogPosts.PostId = BlogComments.PostId
● FROM table, table, table



FROM: Full Generality
These can be nested:

FROM (SELECT ...) INNER JOIN table AS alias

FROM BlogPosts INNER JOIN BlogComments USING (postId)
INNER JOIN BlogUsers USING (username)



FROM: Full Generality
These can be nested:

FROM (SELECT ...) INNER JOIN table AS alias

FROM (BlogPosts INNER JOIN BlogComments USING (postId))
INNER JOIN BlogUsers USING (username)

See the MySQL documentation for JOIN clauses for all the details.



UNION
Unlike other operators, it connects two SELECT statements:

SELECT PostId, Title FROM CatPosts
UNION
SELECT PostId, Title FROM DogPosts;



Column Names and Multiple Sources
SELECT PostId, Title, CommentId, Content
FROM BlogPosts INNER JOIN BlogComments
ON BlogPosts.PostId = BlogComments.PostId;

Error: PostId is ambiguous



Column Names and Multiple Sources
So qualify it:

SELECT BlogPosts.PostId, Title, CommentId, Content
FROM BlogPosts INNER JOIN BlogComments
ON BlogPosts.PostId = BlogComments.PostId;



Column Names and Multiple Sources
Can optionally qualify other columns:

SELECT BlogPosts.PostId, BlogPosts.Title,
    BlogComments.CommentId, BlogComments.Content
FROM BlogPosts INNER JOIN BlogComments
ON BlogPosts.PostId = BlogComments.PostId;
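
The ambiguity error and its fix can be reproduced with a short sketch. It assumes Python's sqlite3 module in place of MySQL; the exact error text differs between systems, but SQLite also rejects the unqualified PostId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BlogPosts (PostId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE BlogComments (
        CommentId INTEGER PRIMARY KEY, PostId INTEGER, Content TEXT);
    INSERT INTO BlogPosts VALUES (1, 'new pet');
    INSERT INTO BlogComments VALUES (10, 1, 'cute!');
""")

join = """
    FROM BlogPosts INNER JOIN BlogComments
         ON BlogPosts.PostId = BlogComments.PostId
"""

# Unqualified PostId exists in both tables, so the query is rejected.
try:
    conn.execute("SELECT PostId, Title, CommentId, Content" + join)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualifying the column with its table name resolves the ambiguity.
qualified = conn.execute(
    "SELECT BlogPosts.PostId, Title, CommentId, Content" + join).fetchall()
```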



NULLs, Expressions, and WHERE clauses
NULL can be tricky.

In most cases, an arithmetic expression, relational expression, or function call evaluates to NULL whenever any of its arguments or operands is NULL:

● 3 + NULL
● CONCAT('xyz', NULL)
● x > 0 (when x is NULL)

If the expression in a WHERE clause evaluates to NULL for a row, that row is not
included in the result.
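
A quick way to see this behavior, sketched with Python's sqlite3 module as a stand-in for MySQL (|| replaces CONCAT, and the table T is made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression has a NULL operand, so each evaluates to NULL,
# which Python receives as None.
null_results = conn.execute(
    "SELECT 3 + NULL, 'xyz' || NULL, NULL > 0").fetchone()

conn.executescript("""
    CREATE TABLE T (x INTEGER);
    INSERT INTO T VALUES (5), (-1), (NULL);
""")

# The NULL row is dropped by the condition AND by its negation,
# because both evaluate to NULL rather than TRUE for that row.
positive = conn.execute("SELECT x FROM T WHERE x > 0").fetchall()
not_positive = conn.execute("SELECT x FROM T WHERE NOT (x > 0)").fetchall()
```

The row with x NULL appears in neither result, which is why a condition and its negation do not always cover the whole table.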



NULL: Exceptions
Important exceptions:

● <expr> IS NULL
● <expr> IS NOT NULL
● Logical operators AND, OR

AND    TRUE   FALSE  NULL
TRUE   TRUE   FALSE  NULL
FALSE  FALSE  FALSE  FALSE
NULL   NULL   FALSE  NULL

OR     TRUE   FALSE  NULL
TRUE   TRUE   TRUE   TRUE
FALSE  TRUE   FALSE  NULL
NULL   TRUE   NULL   NULL
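
The two truth tables can be regenerated by asking a database to evaluate every combination. This sketch assumes Python's sqlite3 module rather than MySQL; SQLite reports TRUE as 1, FALSE as 0, and NULL as None, and the helper name evaluate is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Map the three SQL truth values to what SQLite/Python exchange.
values = {"TRUE": 1, "FALSE": 0, "NULL": None}
labels = {1: "TRUE", 0: "FALSE", None: "NULL"}

def evaluate(op, a, b):
    """Evaluate `a op b` in SQL and return the result as a label."""
    (result,) = conn.execute(
        f"SELECT ? {op} ?", (values[a], values[b])).fetchone()
    return labels[result]

and_table = {(a, b): evaluate("AND", a, b) for a in values for b in values}
or_table = {(a, b): evaluate("OR", a, b) for a in values for b in values}
```

The interesting entries are the ones where NULL does not propagate: NULL AND FALSE is FALSE, and NULL OR TRUE is TRUE.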



Differences between Tables and Relations
● Relational Algebra defined on relations: sets of tuples (i.e., no duplicates)
● SQL works on tables: duplicates are possible
● Can affect results in some cases:
○ SELECT x, y, z FROM ... vs SELECT DISTINCT x, y, z FROM ...
○ UNION:
■ (SELECT ...) UNION (SELECT ...)
■ (SELECT ...) UNION DISTINCT (SELECT ...)
■ (SELECT ...) UNION ALL (SELECT ...)
■ If not specified, default behavior of UNION is to remove duplicates (DISTINCT)
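
The difference between UNION and UNION ALL shows up as soon as both inputs share a row. A small sketch, assuming Python's sqlite3 module in place of MySQL and made-up rows for CatPosts and DogPosts (SQLite does not accept the explicit UNION DISTINCT spelling, so only the other two forms appear):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CatPosts (PostId INTEGER, Title TEXT);
    CREATE TABLE DogPosts (PostId INTEGER, Title TEXT);
    INSERT INTO CatPosts VALUES (1, 'pets'), (2, 'cats');
    INSERT INTO DogPosts VALUES (1, 'pets'), (3, 'dogs');
""")

# Plain UNION removes the duplicate (1, 'pets') row.
union = conn.execute("""
    SELECT PostId, Title FROM CatPosts
    UNION
    SELECT PostId, Title FROM DogPosts
    ORDER BY PostId
""").fetchall()

# UNION ALL keeps every row from both inputs.
union_all = conn.execute("""
    SELECT PostId, Title FROM CatPosts
    UNION ALL
    SELECT PostId, Title FROM DogPosts
    ORDER BY PostId, Title
""").fetchall()
```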



Query Evaluation
1. FROM clause: resolve table references, perform JOINs as needed. Yields an intermediate result table.
2. WHERE clause: filter rows according to conditions. Yields an intermediate result table.
3. SELECT clause: filter, transform, add columns. Return this result table as the result of the entire query.

Evaluation order of the clauses as written in a query:
➂ SELECT
➀ FROM
➁ WHERE



Aggregation
Previously, we’ve computed new attributes based on existing columns in the same row:

SELECT CONCAT(firstName, ' ', lastName) AS name
FROM BlogUsers;

Aggregation allows us to compute new attributes based on multiple rows:

SELECT firstName, COUNT(*) AS c
FROM BlogUsers
GROUP BY firstName;



GROUP BY Clause

SELECT FirstName
FROM BlogUsers
GROUP BY FirstName

BlogUsers:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result (the two Jae rows form one group):
FirstName
Jae
Tony
Dan
James



GROUP BY Clause

SELECT FirstName, LastName
FROM BlogUsers
GROUP BY FirstName, LastName

BlogUsers:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
FirstName  LastName
Jae        Yoon
Jae        O
Tony       Davidson
Dan        Kwan
James      Marks



GROUP BY Clause

SELECT FirstName, COUNT(*) AS NumUsers
FROM BlogUsers
GROUP BY FirstName

BlogUsers:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
FirstName  NumUsers
Jae        2
Tony       1
Dan        1
James      1



GROUP BY Clause

SELECT FirstName, MAX(DoB) AS Youngest
FROM BlogUsers
GROUP BY FirstName

BlogUsers:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
FirstName  Youngest
Jae        2005-01-01
Tony       1996-01-01
Dan        1994-01-01
James      1990-01-01
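
The grouped queries above can be run against the same five sample users. This is a sketch using Python's sqlite3 module in place of MySQL; it combines COUNT(*) and MAX(DoB) in one query and adds an ORDER BY so the row order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BlogUsers ("
    "UserName TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT, DoB TEXT)")
conn.executemany("INSERT INTO BlogUsers VALUES (?, ?, ?, ?)", [
    ("jy", "Jae", "Yoon", "2005-01-01"),
    ("jo", "Jae", "O", "1980-01-01"),
    ("tony", "Tony", "Davidson", "1996-01-01"),
    ("dan", "Dan", "Kwan", "1994-01-01"),
    ("james", "James", "Marks", "1990-01-01"),
])

# One output row per distinct FirstName; aggregates are computed per group.
rows = conn.execute("""
    SELECT FirstName, COUNT(*) AS NumUsers, MAX(DoB) AS Youngest
    FROM BlogUsers
    GROUP BY FirstName
    ORDER BY FirstName
""").fetchall()
```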



Aggregations
Most SQL implementations provide 5 basic aggregations:

● MIN(expr)
● MAX(expr)
● SUM(expr)
● AVG(expr)
● COUNT(expr): number of rows in group where value of expr is not NULL
○ COUNT(DISTINCT expr): number of distinct non-NULL values of expr in group
○ COUNT(*): number of rows in group
○ COUNT(1): effectively, number of rows in group



Aggregations and NULL
All aggregations ignore NULL arguments.

SELECT x, SUM(y) FROM T GROUP BY x;

T:
x  y
A  2
A  7
A  NULL
A  3
B  NULL
B  NULL

Result:
x  SUM(y)
A  12
B  NULL

If the argument is NULL for all rows in a group, the result is NULL, as in the 2nd row above.



COUNT and NULL
COUNT is different.

SELECT x, COUNT(y) FROM T
GROUP BY x;

T:
x  y
A  2
A  7
A  NULL
A  3
B  NULL
B  NULL

Result:
x  COUNT(y)
A  3
B  0

COUNT never evaluates to NULL; it just returns 0.
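
Both NULL rules can be checked side by side on the table T from these slides. A sketch assuming Python's sqlite3 module in place of MySQL; COUNT(*) is included for contrast with COUNT(y).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (x TEXT, y INTEGER)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [
    ("A", 2), ("A", 7), ("A", None), ("A", 3),
    ("B", None), ("B", None),
])

# SUM skips NULLs and is NULL when every input is NULL;
# COUNT(y) skips NULLs but reports 0; COUNT(*) counts rows.
rows = conn.execute("""
    SELECT x, SUM(y), COUNT(y), COUNT(*)
    FROM T
    GROUP BY x
    ORDER BY x
""").fetchall()
```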



GROUP BY…WITH ROLLUP
SELECT year, SUM(revenue) AS revenue
FROM sales
GROUP BY year;

year  revenue
2000  4525
2001  3010



GROUP BY…WITH ROLLUP
SELECT year, SUM(revenue) AS revenue
FROM sales
GROUP BY year WITH ROLLUP;

year  revenue
2000  4525
2001  3010
NULL  7535

● Adds “super-aggregate” row covering all inputs.
● Here, value in rollup row is sum of previous two rows.
● Value depends on aggregation function used.
● If we’d said AVG(revenue), rollup row would have been average of all sales records from 2000 and 2001.
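
To see what the super-aggregate row contains, it can be computed by hand. This is not WITH ROLLUP itself: the sketch assumes Python's sqlite3 module, and SQLite has no WITH ROLLUP, so the extra row is emulated with UNION ALL. The individual sales rows are made up so that the yearly totals match the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, revenue INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [
    (2000, 1500), (2000, 3025),   # hypothetical rows summing to 4525
    (2001, 3000), (2001, 10),     # hypothetical rows summing to 3010
])

# Emulation only: per-year groups, plus one row aggregated over ALL sales
# rows with year reported as NULL, as the rollup row would be.
rows = conn.execute("""
    SELECT year, revenue FROM (
        SELECT year, SUM(revenue) AS revenue FROM sales GROUP BY year
        UNION ALL
        SELECT NULL, SUM(revenue) FROM sales
    )
    ORDER BY year IS NULL, year
""").fetchall()
```

The second branch aggregates the original sales rows rather than the per-year totals, which is why swapping SUM for AVG would give the average over all sales records.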



GROUP BY…WITH ROLLUP
SELECT year, country, product,
    SUM(revenue) AS revenue
FROM sales
GROUP BY year, country, product;

year  country  product     revenue
2000  Germany  calculator  1500
2000  Germany  TV          250
2000  US       computer    2500
2001  US       computer    3000



GROUP BY…WITH ROLLUP
SELECT year, country, product,
    SUM(revenue) AS revenue
FROM sales
GROUP BY year, country, product
WITH ROLLUP;

year  country  product     revenue
2000  Germany  calculator  1500
2000  Germany  TV          250
2000  Germany  NULL        1750
2000  US       computer    2500
2000  US       NULL        2500
2000  NULL     NULL        4250
2001  US       computer    3000
2001  US       NULL        3000
2001  NULL     NULL        3000
NULL  NULL     NULL        7250



HAVING: post-GROUP filtering
SELECT FirstName, COUNT(*) AS numUsers
FROM BlogUsers
GROUP BY FirstName
HAVING numUsers = 1;

BlogUsers:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
FirstName  numUsers
Tony       1
Dan        1
James      1



HAVING
● WHERE filters before grouping: records that don’t meet the condition aren’t grouped
● HAVING filters after grouping: groups that don’t meet the condition aren’t in the result
● HAVING expression can refer only to GROUP BY columns and aggregate functions
● WHERE expression cannot refer to aggregate functions
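
The two filters can be compared on the sample users. A sketch assuming Python's sqlite3 module in place of MySQL; the WHERE cutoff date is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BlogUsers ("
    "UserName TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT, DoB TEXT)")
conn.executemany("INSERT INTO BlogUsers VALUES (?, ?, ?, ?)", [
    ("jy", "Jae", "Yoon", "2005-01-01"),
    ("jo", "Jae", "O", "1980-01-01"),
    ("tony", "Tony", "Davidson", "1996-01-01"),
    ("dan", "Dan", "Kwan", "1994-01-01"),
    ("james", "James", "Marks", "1990-01-01"),
])

# HAVING: groups are formed first, then groups with COUNT(*) <> 1 are dropped.
having_rows = conn.execute("""
    SELECT FirstName, COUNT(*) AS numUsers
    FROM BlogUsers
    GROUP BY FirstName
    HAVING COUNT(*) = 1
    ORDER BY FirstName
""").fetchall()

# WHERE: rows are dropped first, then the survivors are grouped.
where_rows = conn.execute("""
    SELECT FirstName, COUNT(*) AS numUsers
    FROM BlogUsers
    WHERE DoB > '1990-12-31'
    GROUP BY FirstName
    ORDER BY FirstName
""").fetchall()
```

With HAVING, the Jae group (2 users) disappears entirely; with WHERE, one Jae row is removed before grouping, so a Jae group of size 1 remains.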



Semijoins



Semijoins in SQL
CREATE TABLE BlogUser (
    username VARCHAR(255) PRIMARY KEY,
    firstName VARCHAR(255),
    lastName VARCHAR(255)
);
CREATE TABLE BlogPost (
    postID INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(255),
    created DATE,
    title VARCHAR(255),
    body TEXT
);

All users (first & last name) who posted in 2022:

π_{firstName, lastName}(BlogUser ⋉ σ_{YEAR(created) = 2022}(BlogPost))

SELECT firstName, lastName
FROM BlogUser
WHERE username IN
    (SELECT username FROM BlogPost
     WHERE YEAR(created) = 2022);
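
The point of the semijoin form is that each matching user appears once, however many posts match. A sketch assuming Python's sqlite3 module in place of MySQL (strftime('%Y', ...) replaces YEAR) and made-up users and posts; an ordinary inner join is shown for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BlogUser (
        username TEXT PRIMARY KEY, firstName TEXT, lastName TEXT);
    CREATE TABLE BlogPost (
        postID INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT, created TEXT, title TEXT, body TEXT);
    INSERT INTO BlogUser VALUES
        ('jdoe', 'Jane', 'Doe'), ('sdavis', 'Sam', 'Davis');
    INSERT INTO BlogPost (username, created, title, body) VALUES
        ('jdoe', '2022-03-01', 'first', '...'),
        ('jdoe', '2022-09-15', 'second', '...'),
        ('sdavis', '2021-12-31', 'old', '...');
""")

# Semijoin via IN: one row per qualifying user.
semijoin = conn.execute("""
    SELECT firstName, lastName
    FROM BlogUser
    WHERE username IN
        (SELECT username FROM BlogPost
         WHERE strftime('%Y', created) = '2022')
""").fetchall()

# Inner join: one row per (user, matching post) pair.
inner_join = conn.execute("""
    SELECT firstName, lastName
    FROM BlogUser INNER JOIN BlogPost USING (username)
    WHERE strftime('%Y', created) = '2022'
""").fetchall()
```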



ORDER BY
● By default, order of rows in table produced by SELECT is undefined
● SELECT *
FROM BlogUser
ORDER BY FirstName
(Order of the two Jae rows is not specified.)

BlogUser:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
UserName  FirstName  LastName  DoB
dan       Dan        Kwan      1994-01-01
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
james     James      Marks     1990-01-01
tony      Tony       Davidson  1996-01-01



ORDER BY
● By default, order of rows in table produced by SELECT is undefined
● SELECT *
FROM BlogUser
ORDER BY FirstName DESC

BlogUser:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
UserName  FirstName  LastName  DoB
tony      Tony       Davidson  1996-01-01
james     James      Marks     1990-01-01
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
dan       Dan        Kwan      1994-01-01



ORDER BY
● By default, order of rows in table produced by SELECT is undefined
● SELECT *
FROM BlogUser
ORDER BY FirstName, DoB

BlogUser:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
UserName  FirstName  LastName  DoB
dan       Dan        Kwan      1994-01-01
jo        Jae        O         1980-01-01
jy        Jae        Yoon      2005-01-01
james     James      Marks     1990-01-01
tony      Tony       Davidson  1996-01-01



LIMIT: restrict number of rows in result
SELECT * FROM BlogUser
LIMIT 2;

Which rows are returned is not specified, unless you use ORDER BY.

BlogUser:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01



LIMIT: restrict number of rows in result
SELECT * FROM BlogUser
LIMIT 10;

LIMIT clause specifies maximum result size

BlogUser:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result (all 5 rows, since 5 is below the limit of 10):
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01



LIMIT: restrict number of rows in result
SELECT * FROM BlogUser
LIMIT 3 OFFSET 1;

LIMIT clause specifies maximum result size

BlogUser:
UserName  FirstName  LastName  DoB
jy        Jae        Yoon      2005-01-01
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
james     James      Marks     1990-01-01

Result (first row skipped, next 3 returned):
UserName  FirstName  LastName  DoB
jo        Jae        O         1980-01-01
tony      Tony       Davidson  1996-01-01
dan       Dan        Kwan      1994-01-01
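
Combining ORDER BY with LIMIT and OFFSET makes the returned rows predictable, which is the usual way to page through results. A sketch assuming Python's sqlite3 module in place of MySQL, using the same five sample users:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BlogUser ("
    "UserName TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT, DoB TEXT)")
conn.executemany("INSERT INTO BlogUser VALUES (?, ?, ?, ?)", [
    ("jy", "Jae", "Yoon", "2005-01-01"),
    ("jo", "Jae", "O", "1980-01-01"),
    ("tony", "Tony", "Davidson", "1996-01-01"),
    ("dan", "Dan", "Kwan", "1994-01-01"),
    ("james", "James", "Marks", "1990-01-01"),
])

# Sorted order is dan, jo, jy, james, tony; skip 1 row, return the next 3.
page = conn.execute("""
    SELECT UserName
    FROM BlogUser
    ORDER BY FirstName, DoB
    LIMIT 3 OFFSET 1
""").fetchall()
```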



Query Evaluation Order
1. FROM: build source table, including join operations
2. WHERE: filter records from source table
3. GROUP BY: aggregation
4. HAVING: filter aggregate records
5. SELECT: filter/rename/compute columns
6. ORDER BY: sort records
7. LIMIT: constrain result set

Evaluation order of the clauses as written in a query:
5 SELECT
1 FROM
2 WHERE
3 GROUP BY
4 HAVING
6 ORDER BY
7 LIMIT



MySQL Extensions



Expressions in GROUP BY
● MySQL allows expressions in GROUP BY
● Standard SQL requires a nested query
● Advice: follow MySQL for straightforward expressions
MySQL:
SELECT SUBSTRING(FirstName, 1, 1) AS FirstChar, COUNT(*) AS C
FROM BlogUsers
GROUP BY SUBSTRING(FirstName, 1, 1);

Standard SQL:
SELECT A.FirstChar, COUNT(*)
FROM (
    SELECT SUBSTRING(FirstName, 1, 1) AS FirstChar
    FROM BlogUsers) AS A
GROUP BY A.FirstChar;
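
The two formulations should give the same counts. A sketch assuming Python's sqlite3 module in place of MySQL (substr replaces SUBSTRING; SQLite, like MySQL, accepts an expression in GROUP BY), with ORDER BY added so the results can be compared:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BlogUsers (UserName TEXT PRIMARY KEY, FirstName TEXT)")
conn.executemany("INSERT INTO BlogUsers VALUES (?, ?)", [
    ("jy", "Jae"), ("jo", "Jae"), ("tony", "Tony"),
    ("dan", "Dan"), ("james", "James"),
])

# Expression used directly in GROUP BY.
direct = conn.execute("""
    SELECT substr(FirstName, 1, 1) AS FirstChar, COUNT(*) AS C
    FROM BlogUsers
    GROUP BY substr(FirstName, 1, 1)
    ORDER BY FirstChar
""").fetchall()

# Nested query computes the expression first; outer query groups by a column.
nested = conn.execute("""
    SELECT A.FirstChar, COUNT(*)
    FROM (
        SELECT substr(FirstName, 1, 1) AS FirstChar
        FROM BlogUsers) AS A
    GROUP BY A.FirstChar
    ORDER BY A.FirstChar
""").fetchall()
```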



SELECT aliases in GROUP BY, HAVING
● MySQL allows GROUP BY, HAVING to refer to aliases defined in SELECT
● Standard SQL: use column/expr directly or a nested query
● Advice: favor MySQL for convenience

MySQL:
SELECT FirstName AS First, COUNT(*) AS Num
FROM BlogUsers
GROUP BY First
HAVING Num > 10;

Standard SQL:
SELECT FirstName AS First, COUNT(*) AS Num
FROM BlogUsers
GROUP BY FirstName
HAVING COUNT(*) > 10;



INSERT, UPDATE, DELETE
● Insert results of a SELECT query:
INSERT INTO tbl_name (col_name, …)
SELECT … FROM … WHERE …
● Delete records from multiple tables:
DELETE t1, t2 FROM t1 INNER JOIN t2 INNER JOIN t3
WHERE t1.id = t2.id AND t2.id = t3.id
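
A sketch of the INSERT ... SELECT form, assuming Python's sqlite3 module in place of MySQL; the YoungUsers table is hypothetical. The multi-table DELETE is MySQL-specific syntax and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BlogUsers (
        UserName TEXT PRIMARY KEY, FirstName TEXT, DoB TEXT);
    CREATE TABLE YoungUsers (          -- hypothetical target table
        UserName TEXT PRIMARY KEY, FirstName TEXT);
    INSERT INTO BlogUsers VALUES
        ('jy', 'Jae', '2005-01-01'), ('jo', 'Jae', '1980-01-01');

    -- The rows produced by the SELECT become the rows to insert.
    INSERT INTO YoungUsers (UserName, FirstName)
        SELECT UserName, FirstName
        FROM BlogUsers
        WHERE DoB > '1990-02-05';
""")

copied = conn.execute("SELECT UserName, FirstName FROM YoungUsers").fetchall()
```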



Extended Examples



Extended Example: BlogApp
Under Files on Canvas, see lecture-notes/lecture-06-files/blogapp.sql for table
definitions, sample data.

For each day in Feb 2015, how many comments were posted?

SELECT DAY(created), COUNT(*)
FROM BlogComment
WHERE MONTH(created) = 2 AND YEAR(created) = 2015
GROUP BY DAY(created)
ORDER BY DAY(created);



Extended Example: BlogApp
Gets us the answer, but it only lists days on which we have comments.
What if we want to list all days in Feb 2015, with counts of 0 where appropriate?
Add a table with desired values:
CREATE TABLE Date201502 (
day DATE PRIMARY KEY
);
INSERT INTO Date201502 (day) VALUES
('2015-02-01'),
('2015-02-02'),
...,
('2015-02-28');



Extended Example: BlogApp
Then, we can compute Date201502 ⟕ BlogComment and group:

SELECT `day`, COUNT(commentId)
FROM Date201502 LEFT OUTER JOIN BlogComment
ON (`day` = DATE(created))
-- WHERE MONTH(created) = 2 AND YEAR(created) = 2015
GROUP BY `day`
ORDER BY `day`;
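
The calendar-table trick can be tried on a shortened calendar. This sketch assumes Python's sqlite3 module in place of MySQL, only three days instead of the full month, and made-up comments; it also shows why the query counts commentId rather than using COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Date201502 (day TEXT PRIMARY KEY);
    CREATE TABLE BlogComment (commentId INTEGER PRIMARY KEY, created TEXT);
    INSERT INTO Date201502 VALUES
        ('2015-02-01'), ('2015-02-02'), ('2015-02-03');
    INSERT INTO BlogComment VALUES
        (1, '2015-02-01 09:00:00'),
        (2, '2015-02-01 17:30:00'),
        (3, '2015-02-03 12:00:00');
""")

join = """
    FROM Date201502 LEFT OUTER JOIN BlogComment
         ON day = date(created)
    GROUP BY day
    ORDER BY day
"""

# COUNT(commentId) ignores the NULL produced for a day with no comments.
per_day = conn.execute("SELECT day, COUNT(commentId)" + join).fetchall()

# COUNT(*) counts the padded row too, so an empty day would show 1.
per_day_star = conn.execute("SELECT day, COUNT(*)" + join).fetchall()
```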



Exercises
1. Are there any unpublished posts with comments?
2. Most recent ‘created’ timestamp of comments for each username.
3. Count of each username’s comments.
4. Usernames who have commented more than once.
5. Count of comments for each post in descending order; include post ID & title.
6. Username with the most unpublished posts; include status level.
7. Most reshared post; include post ID & title.
8. Number of comments per day for each username.
9. Avg number of comments per day for each username.
10. Usernames with more comments than posts. (hint: include all users)
11. Compare a user’s number of reshares to published posts (hint: include all users).
Who is more likely to read than write (reshares > posts), and vice versa (reshares <
posts)?
