CSE 412 Database Management Lecture 7

SQL Intermediate
Jia Zou
Arizona State University

1
SQL Intermediate

2
Today’s Agenda
• Data Definition Language
• Data Manipulation Language
• Basic Queries (SELECT-FROM-WHERE)
• ORDER BY
• Set Operations

3
Practice
• HW2-Problem 11: Please return to me the distinct names of
suppliers, each of which has supplied at least two parts (Use
Self Join for Query and Relational Algebra)

4
Practice
Simplified Question:
• Please return to me the distinct keys of suppliers, each of which has supplied
at least two parts
We need a self-join on the part-supp table.
Key Idea:
1. Rename the two part-supp tables (e.g., to PS1 and PS2 respectively)
2. A self join on PS1.ps_suppkey = PS2.ps_suppkey AND PS1.ps_partkey <>
PS2.ps_partkey
3. Project on ps_suppkey

10
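Idea 1: Renaming

$\rho_{ps1}(\text{PartSupp})$ and $\rho_{ps2}(\text{PartSupp})$

11
Idea 2: Self-Join

$\rho_{ps1}(\text{PartSupp}) \bowtie_{\text{ps1.ps\_suppkey} = \text{ps2.ps\_suppkey} \,\land\, \text{ps1.ps\_partkey} \neq \text{ps2.ps\_partkey}} \rho_{ps2}(\text{PartSupp})$

12
Idea 3: Projection

$\pi_{\text{ps1.ps\_suppkey}}\big(\rho_{ps1}(\text{PartSupp}) \bowtie_{\text{ps1.ps\_suppkey} = \text{ps2.ps\_suppkey} \,\land\, \text{ps1.ps\_partkey} \neq \text{ps2.ps\_partkey}} \rho_{ps2}(\text{PartSupp})\big)$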

13
Now that we have solved the simplified problem, how do we solve the original problem?

HW2-Problem 11: Please return to me the distinct names of suppliers, each of which has supplied at least two parts (Use Self Join for Query and Relational Algebra)

14
• Idea: Join with the Supplier table, which contains the names of suppliers

15
Join with the Supplier table:

$\text{Supplier} \bowtie_{\text{Supplier.s\_suppkey} = \text{ps1.ps\_suppkey}} \pi_{\text{ps1.ps\_suppkey}}\big(\rho_{ps1}(\text{PartSupp}) \bowtie_{\text{ps1.ps\_suppkey} = \text{ps2.ps\_suppkey} \,\land\, \text{ps1.ps\_partkey} \neq \text{ps2.ps\_partkey}} \rho_{ps2}(\text{PartSupp})\big)$

Is that sufficient?

The user wants supplier names, so we need to run a projection on s_name:

$\pi_{\text{s\_name}}\Big(\text{Supplier} \bowtie_{\text{Supplier.s\_suppkey} = \text{ps1.ps\_suppkey}} \pi_{\text{ps1.ps\_suppkey}}\big(\rho_{ps1}(\text{PartSupp}) \bowtie_{\text{ps1.ps\_suppkey} = \text{ps2.ps\_suppkey} \,\land\, \text{ps1.ps\_partkey} \neq \text{ps2.ps\_partkey}} \rho_{ps2}(\text{PartSupp})\big)\Big)$
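Final Answer to HW2-Problem 11
Relational Algebra: $\pi_{\text{s\_name}}\big(\text{Supplier} \bowtie_{\text{s\_suppkey} = \text{ps1.ps\_suppkey}} (\rho_{ps1}(\text{PartSupp}) \bowtie_{\text{ps1.ps\_partkey} \neq \text{ps2.ps\_partkey} \,\land\, \text{ps1.ps\_suppkey} = \text{ps2.ps\_suppkey}} \rho_{ps2}(\text{PartSupp}))\big)$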

SQL Query
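A minimal sketch of how the relational algebra answer might be written as a SQL self-join, run through Python's sqlite3 module so it can be executed as is. The table and column names follow the slides; the three suppliers and four PartSupp rows are hypothetical data made up for illustration, and this is one possible formulation rather than the official homework solution.

```python
import sqlite3

# In-memory database with hypothetical miniature data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (s_suppkey INTEGER PRIMARY KEY, s_name TEXT);
    CREATE TABLE PartSupp (ps_partkey INTEGER, ps_suppkey INTEGER,
                           PRIMARY KEY (ps_partkey, ps_suppkey));
    INSERT INTO Supplier VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cog');
    INSERT INTO PartSupp VALUES (10, 1), (11, 1), (12, 1), (10, 2);
""")

# PartSupp appears twice under the aliases ps1 and ps2 (the renaming step).
# A supplier qualifies when two of its PartSupp rows name different parts.
SELF_JOIN_QUERY = """
    SELECT DISTINCT s.s_name
    FROM Supplier AS s, PartSupp AS ps1, PartSupp AS ps2
    WHERE s.s_suppkey = ps1.ps_suppkey
      AND ps1.ps_suppkey = ps2.ps_suppkey
      AND ps1.ps_partkey <> ps2.ps_partkey
    ORDER BY s.s_name
"""

supplier_names = [row[0] for row in conn.execute(SELF_JOIN_QUERY)]
print(supplier_names)
```

In this toy data only Acme supplies more than one part, so it is the only name returned; DISTINCT collapses the several (ps1, ps2) pairs that Acme produces.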

19
HW2-Problem 12 will be similar!

20
Today’s Agenda
• Data Definition Language
• Data Manipulation Language
• Basic Queries (SELECT-FROM-WHERE)
• ORDER BY
• Set Operations
• Null Values

21
Missing Information
• Example: User (uid, name, age, pop)
• Value unknown
• We do not know Nelson’s age
• Value not applicable
• Suppose pop is based on interactions with others on our social networking site
• Nelson is new to our site; what is his pop?

22
Solution 1
• Dedicate a value from each domain (type)
• pop cannot be −1, so use −1 as a special value to
indicate a missing or invalid pop
• Leads to incorrect answers if not careful
• SELECT AVG(pop) FROM User;
• Complicates applications
• SELECT AVG(pop) FROM User WHERE pop <> -1;
• Perhaps the value is not as special as you
think!
• Ever heard of the Y2K bug? “00” was used as a
missing or invalid year value
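The pitfall of a dedicated "missing" value can be seen in a small sketch using Python's sqlite3 module. The User schema is from the slides; the rows for Ann and Bob, their pop values, and the use of −1 for Nelson's age are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (uid INTEGER, name TEXT, age INTEGER, pop REAL);
    -- Nelson's pop (and age) are unknown, so -1 is stored as a sentinel.
    INSERT INTO User VALUES (1, 'Ann', 20, 0.5),
                            (2, 'Bob', 30, 0.9),
                            (789, 'Nelson', -1, -1);
""")

# The sentinel is averaged in like any other number: wrong answer.
naive_avg = conn.execute("SELECT AVG(pop) FROM User").fetchone()[0]

# Every query has to remember to filter the sentinel out.
filtered_avg = conn.execute(
    "SELECT AVG(pop) FROM User WHERE pop <> -1"
).fetchone()[0]

print(naive_avg, filtered_avg)
```

The first average is dragged down by the sentinel, while the second matches the average of the two real values.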

23
Solution 2
• A valid-bit for every column
• User (uid, name, name_is_valid, age, age_is_valid, pop, pop_is_valid)
• Complicates schema and queries
• SELECT AVG(pop) FROM User WHERE pop_is_valid;

24
Solution 3
• Decompose the table; missing row = missing value
• UserName (uid, name)
• UserAge (uid, age)
• UserPop (uid, pop)
• UserID (uid)
• Still complicates schema and queries
• How to get all information about users in a table?
• Natural join doesn’t work!

25
SQL’s solution
• A special value NULL
• For every domain
• Special rules for dealing with NULLs
• Example: User (uid, name, age, pop)
• <789, “Nelson”, NULL, NULL>

26
Computing with NULLs
• When we operate on a NULL and another value (including another
NULL) using +, −, etc., the result is NULL
• Aggregate functions ignore NULL, except COUNT(*) (since it counts
rows)
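Both rules can be checked with a short sketch in Python's sqlite3 module. The Nelson row is the one from the earlier slide; the rows for Ann and Bob are hypothetical, and other database systems may differ in details not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (uid INTEGER, name TEXT, age INTEGER, pop REAL);
    INSERT INTO User VALUES (1, 'Ann', 20, 0.5),
                            (2, 'Bob', 30, 0.9),
                            (789, 'Nelson', NULL, NULL);
""")

# Arithmetic on NULL yields NULL (None on the Python side).
null_plus_one = conn.execute(
    "SELECT pop + 1 FROM User WHERE uid = 789"
).fetchone()[0]

# COUNT(*) counts rows; COUNT(pop) and AVG(pop) skip the NULL.
count_rows, count_pop, avg_pop = conn.execute(
    "SELECT COUNT(*), COUNT(pop), AVG(pop) FROM User"
).fetchone()

print(null_plus_one, count_rows, count_pop, avg_pop)
```

Here COUNT(*) sees three rows, but only two non-NULL pop values contribute to COUNT(pop) and AVG(pop).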

27
Three-valued Logic
• TRUE = 1, FALSE = 0, UNKNOWN = 0.5
• x AND y = min(x, y)
• x OR y = max(x, y)
• NOT x = 1 − x
• When we compare a NULL with another value (including another
NULL) using = , >, etc., the result is UNKNOWN
• WHERE and HAVING clauses only select rows for output if the
condition evaluates to TRUE
• UNKNOWN is not enough
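The numeric encoding above translates directly into code. The following Python sketch is only an illustration of the min/max/complement rules; the function names (`and3`, `or3`, `not3`, `sql_equals`, `where_keeps`) are made up for this example and are not part of any SQL API.

```python
# Truth values encoded as on the slide.
TRUE, UNKNOWN, FALSE = 1.0, 0.5, 0.0

def and3(x, y):
    """Three-valued AND: the minimum of the two truth values."""
    return min(x, y)

def or3(x, y):
    """Three-valued OR: the maximum of the two truth values."""
    return max(x, y)

def not3(x):
    """Three-valued NOT: one minus the truth value."""
    return 1 - x

def sql_equals(a, b):
    """SQL-style '=': UNKNOWN if either operand is NULL (None here)."""
    if a is None or b is None:
        return UNKNOWN
    return TRUE if a == b else FALSE

def where_keeps(truth):
    """WHERE/HAVING keep a row only when the condition is TRUE."""
    return truth == TRUE

# A row with pop = NULL fails both "pop = 1" and "NOT (pop = 1)".
cond = sql_equals(None, 1)
print(where_keeps(cond), where_keeps(not3(cond)))
```

Because NOT UNKNOWN is still UNKNOWN, a row with a NULL is rejected by a condition and by its negation alike.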

28
IS NULL/IS NOT NULL
• Example: Who has NULL pop values?
• SELECT * FROM User WHERE pop = NULL;
• Does not work; never returns anything
• SQL introduced special, built-in predicates IS NULL and IS NOT NULL
• SELECT * FROM User WHERE pop IS NULL;
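A quick check of the two predicates with Python's sqlite3 module. The Nelson row is from the slides; the row for Ann is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (uid INTEGER, name TEXT, age INTEGER, pop REAL);
    INSERT INTO User VALUES (1, 'Ann', 20, 0.5),
                            (789, 'Nelson', NULL, NULL);
""")

# "pop = NULL" evaluates to UNKNOWN for every row, so nothing is selected.
rows_equals_null = conn.execute(
    "SELECT uid FROM User WHERE pop = NULL"
).fetchall()

# IS NULL is a real predicate that is TRUE exactly for the NULL rows.
rows_is_null = conn.execute(
    "SELECT uid FROM User WHERE pop IS NULL"
).fetchall()

rows_is_not_null = conn.execute(
    "SELECT uid FROM User WHERE pop IS NOT NULL"
).fetchall()

print(rows_equals_null, rows_is_null, rows_is_not_null)
```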

29
Outerjoin motivation
First let’s see: what is the natural join result?

Employee ⨝ Dept =

Name     EmpId  DeptName  Manager
Sally    2241   Sales     Harriet
Harriet  2202   Sales     Harriet

Bad: the other employees' information gets lost, because their departments do not have a record in the Dept table.

33
Optional: Left Outer Natural Join
• Notation: ⟕
• In a left outer join, Employee rows without a matching Department row appear in the result, but not vice versa.
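The difference between the natural join and the left outer natural join can be sketched with Python's sqlite3 module. Sally and Harriet are the rows shown in the join result on the slides; the employee George (Finance) and the department Production (managed by Charles) are hypothetical rows added so that each side has an unmatched tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Name TEXT, EmpId INTEGER, DeptName TEXT);
    CREATE TABLE Dept (DeptName TEXT, Manager TEXT);
    INSERT INTO Employee VALUES ('Sally', 2241, 'Sales'),
                                ('Harriet', 2202, 'Sales'),
                                ('George', 3401, 'Finance');
    INSERT INTO Dept VALUES ('Sales', 'Harriet'),
                            ('Production', 'Charles');
""")

# Natural join: George disappears because Finance has no Dept row.
inner_rows = conn.execute("""
    SELECT Name, EmpId, DeptName, Manager
    FROM Employee NATURAL JOIN Dept
    ORDER BY EmpId
""").fetchall()

# Left outer natural join: George is kept, with NULL for Manager.
# Production (no employees) still does not appear.
outer_rows = conn.execute("""
    SELECT Name, EmpId, DeptName, Manager
    FROM Employee NATURAL LEFT OUTER JOIN Dept
    ORDER BY EmpId
""").fetchall()

print(inner_rows)
print(outer_rows)
```

The unmatched Employee row is padded with NULL (None in Python), which ties this slide back to the NULL rules covered earlier.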

35
Today’s Agenda
• Data Definition Language
• Data Manipulation Language
• Basic Queries (SELECT-FROM-WHERE)
• ORDER BY
• Set Operations
• Null Values
• Aggregation

36
Aggregates
• Standard SQL aggregate functions: COUNT, SUM, AVG, MIN, MAX
• Example: number of users under 18, and their average popularity
• SELECT COUNT(*), AVG(pop)
FROM User
WHERE age < 18;
• COUNT(*) counts the number of rows

37
Aggregates with Distinct
• Example: How many users are in some group?
• SELECT COUNT(DISTINCT uid)
FROM Member;
is equivalent to:
• SELECT COUNT(*)
FROM (SELECT DISTINCT uid FROM Member);
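The equivalence can be checked on a small table with Python's sqlite3 module. Only the Member table and its uid column appear on the slide; the gid column and the four rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (uid INTEGER, gid TEXT);
    -- User 1 belongs to two groups, so uid 1 appears twice.
    INSERT INTO Member VALUES (1, 'a'), (1, 'b'), (2, 'a'), (3, 'c');
""")

count_distinct = conn.execute(
    "SELECT COUNT(DISTINCT uid) FROM Member"
).fetchone()[0]

count_subquery = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT uid FROM Member)"
).fetchone()[0]

# Without DISTINCT, every membership row is counted.
count_all_rows = conn.execute("SELECT COUNT(*) FROM Member").fetchone()[0]

print(count_distinct, count_subquery, count_all_rows)
```

Some systems require an alias on the subquery in the FROM clause; SQLite, used here, does not.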

38
Practice
• Please return me the total number of parts of which the
retailing price is higher than 2000.

39
Practice
• I want to purchase all available instances of the part
named 'blush thistle blue yellow saddle' in the world.
How much supply cost in total should I pay for these
parts? (Each supplier supplies a different available
quantity of the part, and each supplier supplies the part
at a different supply cost)
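One way to read this question is as a SUM over quantity times unit cost for every supplier of the named part. The sketch below, using Python's sqlite3 module, is a possible approach and not the official answer. The column names p_partkey, p_name and ps_availqty are assumptions, ps_suppliercost follows the spelling used in a later practice slide, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Part (p_partkey INTEGER PRIMARY KEY, p_name TEXT);
    CREATE TABLE PartSupp (ps_partkey INTEGER, ps_suppkey INTEGER,
                           ps_availqty INTEGER, ps_suppliercost REAL);
    INSERT INTO Part VALUES (1, 'blush thistle blue yellow saddle'),
                            (2, 'other part');
    -- Two suppliers offer part 1 at different quantities and costs.
    INSERT INTO PartSupp VALUES (1, 10, 100, 2.5),
                                (1, 11, 40, 3.0),
                                (2, 10, 5, 1.0);
""")

# Each supplier contributes (available quantity) x (cost per unit).
total_cost = conn.execute("""
    SELECT SUM(ps.ps_availqty * ps.ps_suppliercost)
    FROM Part AS p, PartSupp AS ps
    WHERE p.p_partkey = ps.ps_partkey
      AND p.p_name = 'blush thistle blue yellow saddle'
""").fetchone()[0]

print(total_cost)
```

With the toy rows above the total is 100 × 2.5 + 40 × 3.0; the row for the other part is excluded by the name filter.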

41
Grouping
• SELECT … FROM … WHERE …
GROUP BY list_of_columns

• Example: compute average popularity for each age group
SELECT age, AVG(pop)
FROM User
GROUP BY age;

43
Example of Grouping By
• SELECT age, AVG(pop) FROM User GROUP BY age;
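A small runnable sketch of this query, using Python's sqlite3 module. The three users and their values are hypothetical, and ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (uid INTEGER, name TEXT, age INTEGER, pop REAL);
    INSERT INTO User VALUES (1, 'Ann', 20, 0.4),
                            (2, 'Bob', 20, 0.6),
                            (3, 'Cat', 30, 0.9);
""")

# Rows are partitioned by age; AVG(pop) is computed once per partition.
avg_by_age = conn.execute("""
    SELECT age, AVG(pop)
    FROM User
    GROUP BY age
    ORDER BY age
""").fetchall()

for age, avg_pop in avg_by_age:
    print(age, avg_pop)
```

The two 20-year-olds fall into one group and the 30-year-old into another, so the result has one row per distinct age.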

44
Example of Aggregates (with no Group By)
• An aggregate query with no GROUP BY clause = all rows go into one
group
• SELECT AVG(pop) FROM User;
SELECT AVG(pop) AS avg_pop
FROM User

45
Having
• Used to filter groups based on the group properties (e.g., aggregate
values, GROUP BY column values)
• SELECT … FROM … WHERE … GROUP BY …
HAVING condition;

46
Having examples
• List the average popularity for each age group with more than a
hundred users

• Find average popularity for each age group over 10
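The two examples might be written as follows; this is a sketch in Python's sqlite3 module, not necessarily the queries shown in lecture. It assumes "age group over 10" means groups whose age value exceeds 10, and the data (150 users aged 9, 101 aged 20, 2 aged 30) is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE User (uid INTEGER PRIMARY KEY, name TEXT, age INTEGER, pop REAL)"
)

# Hypothetical data: (age, pop) pairs; uid is assigned automatically.
rows = [(9, 1.0)] * 150 + [(20, 0.5)] * 101 + [(30, 0.25)] * 2
conn.executemany("INSERT INTO User (age, pop) VALUES (?, ?)", rows)

# HAVING on an aggregate: keep only groups with more than 100 users.
big_groups = conn.execute("""
    SELECT age, AVG(pop)
    FROM User
    GROUP BY age
    HAVING COUNT(*) > 100
    ORDER BY age
""").fetchall()

# HAVING on a GROUP BY column: keep only groups whose age exceeds 10.
older_groups = conn.execute("""
    SELECT age, AVG(pop)
    FROM User
    GROUP BY age
    HAVING age > 10
    ORDER BY age
""").fetchall()

print(big_groups)
print(older_groups)
```

The second filter refers only to a GROUP BY column, so it could equally be written as WHERE age > 10 before grouping; the first cannot, because COUNT(*) is known only after the groups are formed.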

47
Practice
• Please return me the total number of parts supplied by
suppliers whose account balance is below 0. The results should
contain two columns:
• Supplier key: SupplierKey
• Number of different types of parts supplied by this supplier: NumParts

48
Practice
• Please return me the total number of parts supplied by
suppliers whose supplier cost is below 100. The results should
contain two columns:
• Supplier key: SupplierKey
• Number of different types of parts supplied by this supplier: NumParts

SELECT ps_suppkey AS SupplierKey, COUNT(*) AS NumParts
FROM PartSupp
WHERE ps_suppliercost < 100
GROUP BY ps_suppkey;
49
Practice
• Please return me the total number of parts supplied by
suppliers whose account balance is below 0. The results should
contain two columns:
• Supplier key: SupplierKey
• Number of different types of parts supplied by this supplier: NumParts
• Only show those suppliers whose NumParts is larger than 10!

50
Practice
• Please return me the total number of parts supplied by
suppliers whose supplier cost is below 100. The results should
contain two columns:
• Supplier key: SupplierKey
• Number of different types of parts supplied by this supplier: NumParts

SELECT ps_suppkey AS SupplierKey, COUNT(*) AS NumParts
FROM PartSupp
WHERE ps_suppliercost < 100
GROUP BY ps_suppkey
HAVING COUNT(*) > 10;

51
Practice
• Please return me the total number of parts supplied by
suppliers whose account balance is below 0. The results should
contain two columns:
• Supplier name: SupplierName
• Number of different types of parts supplied by this supplier: NumParts

52