Writing Joins in MySQL
Presented by: Sheeri K. Cabral
01/29/10
1
Topics Covered
● JOINs
– OUTER
● LEFT, RIGHT, FULL OUTER
– INNER
● INNER, NATURAL, comma (,)
– CROSS
● Subqueries
– DEPENDENT SUBQUERY
– DERIVED TABLE
● Changing a subquery to a JOIN
2
Example
● 6-week intensive course
● Homework every Friday
– Each assignment is 6% of your grade
– Lowest grade is dropped
– 30% of your grade, total
● Weekly tests every Monday
– Same grading structure as hw
● Midterm – Wed. 1/20 – 15% of your grade
● Final exam – Friday 2/12 – 25% of your grade
3
Sample data
● work table
CREATE TABLE work (
work_id tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
wname varchar(255) DEFAULT NULL,
given date DEFAULT NULL,
pct_of_grade tinyint(3) unsigned NOT NULL,
PRIMARY KEY (work_id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
4
Sample data
mysql> SELECT * FROM work;
+---------+---------+------------+--------------+
| work_id | wname | given | pct_of_grade |
+---------+---------+------------+--------------+
| 1 | hw1 | 2010-01-01 | 6 |
| 2 | test1 | 2010-01-04 | 6 |
| 3 | hw2 | 2010-01-08 | 6 |
| 4 | test2 | 2010-01-11 | 6 |
| 5 | hw3 | 2010-01-15 | 6 |
| 6 | test3 | 2010-01-18 | 6 |
| 7 | midterm | 2010-01-20 | 15 |
| 8 | hw4 | 2010-01-22 | 6 |
| 9 | test4 | 2010-01-25 | 6 |
| 10 | hw5 | 2010-01-29 | 6 |
| 11 | test5 | 2010-02-01 | 6 |
| 12 | hw6 | 2010-02-05 | 6 |
| 13 | test6 | 2010-02-08 | 6 |
| 14 | final | 2010-02-12 | 25 |
+---------+---------+------------+--------------+
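The percentages in the work table add up to more than 100 because the lowest homework and the lowest test are dropped later. A quick check, sketched against the sample data above (the `category`, `items` and `pct` aliases are illustrative names, not part of the original deck):

```sql
-- Total weight of everything that was assigned: 12 * 6 + 15 + 25 = 112.
SELECT SUM(pct_of_grade) AS total_pct
FROM work;

-- Weight per kind of work. 'hw_' and 'test_' use the single-character
-- wildcard, so they match hw1..hw6 and test1..test6 only.
SELECT CASE
         WHEN wname LIKE 'hw_'   THEN 'homework'
         WHEN wname LIKE 'test_' THEN 'tests'
         ELSE wname
       END AS category,
       COUNT(*)          AS items,
       SUM(pct_of_grade) AS pct
FROM work
GROUP BY category;
```

Homework and tests each come to 36 before the drop; removing one 6% item from each leaves 30 + 30 + 15 + 25 = 100.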
5
Sample data
● student table
CREATE TABLE student (
student_id tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
name varchar(255) DEFAULT NULL,
email varchar(255) DEFAULT NULL,
PRIMARY KEY (student_id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
● Entries
+------------+-----------------+------------------+
| student_id | name | email |
+------------+-----------------+------------------+
| 1 | Sheeri Cabral | [email protected] |
| 2 | Giuseppe Maxia | [email protected] |
| 3 | Colin Charles | [email protected] |
| 4 | Ronald Bradford | [email protected] |
+------------+-----------------+------------------+
6
Sample data
● student_work table
CREATE TABLE student_work (
student_id tinyint(3) unsigned NOT NULL,
work_id tinyint(3) unsigned NOT NULL,
grade_num tinyint(3) unsigned DEFAULT NULL,
grade_letter char(2) DEFAULT NULL,
for_grade enum('y','n') DEFAULT 'y',
KEY student_id (student_id),
KEY work_id (work_id),
CONSTRAINT student_work_ibfk_1 FOREIGN KEY (student_id)
REFERENCES student (student_id),
CONSTRAINT student_work_ibfk_2 FOREIGN KEY (work_id)
REFERENCES work (work_id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
7
Sample data
● student_work entries
INSERT INTO student_work (student_id,work_id,grade_num)
VALUES
-- Sheeri had 88 for each hw/test except hw6 (72),
-- midterm 88, final 90, and she did not take test3
-- (the rows below have no entry for hw4 either).
(1,1,88),(1,2,88),(1,3,88),(1,4,88),(1,5,88),(1,7,88),
(1,9,88),(1,10,88),(1,11,88),(1,12,72),(1,13,88),
(1,14,90),
-- Giuseppe completed all assignments/tests:
(2,1,100),(2,2,100),(2,3,90),(2,4,88),(2,5,88),(2,6,85),
(2,7,95),(2,8,100),(2,9,100),(2,10,82),(2,11,85),
(2,12,89),(2,13,90),(2,14,96);
8
Sample data
● student_work entries
INSERT INTO student_work (student_id,work_id,grade_num)
VALUES
-- Colin is busy planning the 2010 User Conference, and
-- did not complete any hw assignments, and as a result
-- did not do well on the tests
(3,2,75),(3,4,77),(3,6,89),(3,7,85),(3,9,72),(3,11,89),
(3,13,70),(3,14,80),
-- Ronald knew his stuff but got busy as the course
-- went on....
(4,1,100),(4,2,100),(4,3,95),(4,4,95),(4,5,90),(4,6,90),
(4,7,95),(4,8,85),(4,9,85),(4,10,80),(4,11,80),(4,12,75),
(4,13,75),(4,14,83);
9
Sample data
● Global grade_num_letter table
CREATE TABLE grade_num_letter (
grade_num tinyint(3) unsigned NOT NULL,
grade_letter char(2) NOT NULL DEFAULT '',
PRIMARY KEY (grade_num)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
10
Venn Diagram
11
13
INNER JOIN
14
15
CROSS and INNER are only semantic in MySQL
● CROSS JOIN with an ON clause acts as an INNER JOIN:
SELECT s.grade_num, g.grade_letter
FROM student_work AS s
CROSS JOIN grade_num_letter AS g
ON (s.grade_num=g.grade_num);
16
In fact you do not need either!
● JOIN acting as an INNER JOIN:
SELECT s.grade_num, g.grade_letter
FROM student_work AS s
JOIN grade_num_letter AS g
ON (s.grade_num=g.grade_num);
17
JOIN clause
● ON (…) or USING (…)
● Same rows either way; USING needs the column to have the same name in both tables
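As a sketch of the two spellings, here is the letter-grade join from the earlier slides written both ways, with INNER JOIN spelled out:

```sql
-- ON: the join condition is written explicitly.
SELECT s.grade_num, g.grade_letter
FROM student_work AS s
INNER JOIN grade_num_letter AS g
  ON (s.grade_num = g.grade_num);

-- USING: shorthand for the same equality on a column
-- that has the same name in both tables.
SELECT s.grade_num, g.grade_letter
FROM student_work AS s
INNER JOIN grade_num_letter AS g
  USING (grade_num);
```

One visible difference: with SELECT *, the USING form lists grade_num once, while the ON form lists it once per table.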
18
My Best Practices
● Don't use a comma to join
– The comma has lower precedence than JOIN, so it behaves unexpectedly next to other JOINs in a query
● Never use a bare JOIN; always write INNER JOIN or
CROSS JOIN
– Whoever debugs will know your intention
● Put join conditions in the JOIN clause (ON/USING), not in the WHERE clause
– Makes it clear what is a filter and what is a join
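A minimal sketch of the comma pitfall, using the sample tables. The first statement is left as a comment because current MySQL versions are expected to reject it: JOIN binds tighter than the comma, so the ON clause can only see `work` and `student_work`.

```sql
-- Expected to fail with "Unknown column 's.student_id' in 'on clause'",
-- because it is parsed as: student, (work LEFT JOIN student_work ...)
--
-- SELECT s.name, sw.grade_num
-- FROM student AS s, work AS w
-- LEFT JOIN student_work AS sw
--   ON sw.student_id = s.student_id AND sw.work_id = w.work_id
-- WHERE w.wname = 'test3';

-- Same intent with an explicit CROSS JOIN: joins are evaluated left to
-- right, so the ON clause can refer to both student and work.
SELECT s.name, sw.grade_num
FROM student AS s
CROSS JOIN work AS w
LEFT JOIN student_work AS sw
  ON sw.student_id = s.student_id AND sw.work_id = w.work_id
WHERE w.wname = 'test3';
```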
19
Getting the letter grades
[Diagram: student_work AS s (student_id, work_id, grade_num, grade_letter, for_grade) joined to grade_num_letter AS g (grade_num, grade_letter) on s.grade_num = g.grade_num]
20
OUTER JOIN
● Show all rows from one side, with NULLs where the other side has no match
● “get all grades for test3”
SELECT name, wname, grade_num
FROM student CROSS JOIN work
LEFT OUTER JOIN student_work
USING (student_id,work_id)
WHERE wname='test3';
21
OUTER JOIN
● LEFT OUTER JOIN
● “OUTER” is redundant
22
LEFT JOIN, inclusive
23
LEFT JOIN, exclusive
● Can be simulated
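One common way to write an exclusive LEFT JOIN is to keep only the rows where the right-hand side came back empty. A sketch against the sample tables, asking who has no grade recorded for test3 (the aliases `s`, `w` and `sw` are illustrative):

```sql
-- Every student paired with test3, then the matching grade if any.
-- Keeping only rows where the student_work side is NULL leaves the
-- students who have no row for that test.
SELECT s.name
FROM student AS s
CROSS JOIN work AS w
LEFT JOIN student_work AS sw
  ON sw.student_id = s.student_id
 AND sw.work_id = w.work_id
WHERE w.wname = 'test3'
  AND sw.student_id IS NULL;
```

With the sample data this should return only Sheeri Cabral, the one student without a test3 row. The IS NULL test uses a join column rather than grade_num, since grade_num is allowed to be NULL in a real row.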
25
FULL OUTER JOIN, inclusive
27
FULL OUTER JOIN, exclusive
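MySQL has no FULL OUTER JOIN keyword, so both variants are usually simulated with a UNION of a LEFT JOIN and a RIGHT JOIN. The following is a sketch of that common pattern on the sample tables, not necessarily the form shown in the original slides:

```sql
-- Inclusive: all rows from both sides, matched where possible.
-- The second branch adds only the right-side rows with no match,
-- so UNION ALL does not duplicate the matched rows.
SELECT s.student_id, s.name, sw.work_id, sw.grade_num
FROM student AS s
LEFT JOIN student_work AS sw ON sw.student_id = s.student_id
UNION ALL
SELECT sw.student_id, s.name, sw.work_id, sw.grade_num
FROM student AS s
RIGHT JOIN student_work AS sw ON sw.student_id = s.student_id
WHERE s.student_id IS NULL;

-- Exclusive: only the rows that have no match on the other side.
SELECT s.student_id, s.name, sw.work_id, sw.grade_num
FROM student AS s
LEFT JOIN student_work AS sw ON sw.student_id = s.student_id
WHERE sw.student_id IS NULL
UNION ALL
SELECT sw.student_id, s.name, sw.work_id, sw.grade_num
FROM student AS s
RIGHT JOIN student_work AS sw ON sw.student_id = s.student_id
WHERE s.student_id IS NULL;
```

On the sample data the exclusive query should come back empty: every student has at least one grade, and the foreign key prevents grades without a student.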
29
NATURAL JOIN example
● Instead of:
SELECT name,grade_num
FROM student INNER JOIN student_work USING
(student_id)
WHERE name='Sheeri Cabral';
● Write:
SELECT name,grade_num
FROM student NATURAL JOIN student_work
WHERE name='Sheeri Cabral';
30
NATURAL JOIN gone awry
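● NATURAL JOIN joins on every column name the tables share, even when those fields are not meant to be equal
● student_work and grade_num_letter share both grade_num and grade_letter, so name the join column instead: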
SELECT sw.grade_num, gnl.grade_letter FROM
student_work AS sw INNER JOIN
grade_num_letter AS gnl USING (grade_num);
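For contrast, this is a sketch of the NATURAL JOIN that goes wrong on these two tables:

```sql
-- NATURAL JOIN picks up every shared column name, so this behaves like
-- USING (grade_num, grade_letter), not USING (grade_num).
SELECT sw.grade_num, gnl.grade_letter
FROM student_work AS sw
NATURAL JOIN grade_num_letter AS gnl;
```

The sample INSERT statements never set student_work.grade_letter, so it is NULL in every row, the equality on grade_letter is never true, and the query should return no rows.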
32
Procedural Thinking
● How to get the names and grades for test1
33
Procedural Thinking: Get the names
and grades for test1
● “Get names and grades”
SELECT name, grade_num
34
Procedural Thinking: Get the names
and grades for test1
SELECT name, grade_num
● “start with the join table”
FROM student_work
35
Procedural Thinking: Get the names
and grades for test1
SELECT name, grade_num
FROM student_work
● “get the name”
INNER JOIN student USING (student_id)
36
Procedural Thinking: Get the names
and grades for test1
SELECT name, grade_num
FROM student_work
INNER JOIN student USING (student_id)
● “But only get test1”
WHERE work_id IN (SELECT work_id FROM work
WHERE wname='test1')
37
Procedural Thinking: Get the names
and grades for test1
SELECT name, grade_num
FROM student_work
INNER JOIN student USING (student_id)
WHERE work_id IN (SELECT work_id FROM work
WHERE wname='test1');
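To see how MySQL treats the subquery, prefix the statement with EXPLAIN. This is only a sketch; the plan depends on the server version and the data:

```sql
-- Look at the select_type column of the output. Older MySQL versions
-- are known to report an IN (...) subquery like this one as
-- DEPENDENT SUBQUERY (re-evaluated per outer row); newer versions may
-- rewrite it as a join instead.
EXPLAIN
SELECT name, grade_num
FROM student_work
INNER JOIN student USING (student_id)
WHERE work_id IN (SELECT work_id FROM work
                  WHERE wname = 'test1');
```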
38
Declarative Thinking: Get the names
and grades for test1
● I have 3 sets of data
● student has the name
● student_work has the grades
● work has the name of the assignment
SELECT name, grade_num FROM .... WHERE
wname='test1'
40
Declarative Thinking: Get the names
and grades for test1
● student and student_work relate by student_id
SELECT name, grade_num FROM
student INNER JOIN student_work USING
(student_id)
....
WHERE wname='test1';
41
Declarative Thinking: Get the names
and grades for test1
● work and student_work relate by work_id
SELECT name, grade_num FROM
student INNER JOIN student_work USING
(student_id)
INNER JOIN work USING (work_id)
WHERE wname='test1';
42
But.....
● That falls apart for test3, because Sheeri did not
take test3.
● So now what?
43
Get all grades for test3
● Start with:
SELECT name, grade_num FROM....
WHERE wname='test3';
44
Get all grades for test3
● We want a listing for each row in “work” against
each row in “student”
SELECT name, grade_num
FROM student CROSS JOIN work
....
WHERE wname='test3';
45
Get all grades for test3
● Grades might not exist for all the rows
● ...so we'll need an outer join
● Fill in the values from student_work for the grades
that do exist, joining on student_id and work_id:
SELECT name, grade_num
FROM student CROSS JOIN work
LEFT JOIN student_work USING (student_id,
work_id) WHERE wname='test3';
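A related point about filters versus joins: a WHERE condition on a student_work column throws away the NULL rows the LEFT JOIN just produced. The sketch below assumes we only want grades that count (for_grade = 'y') and switches from USING to ON so the extra condition can sit inside the join:

```sql
-- The for_grade test is part of the join, so a student with no
-- counting grade for test3 still appears, with grade_num NULL.
SELECT s.name, sw.grade_num
FROM student AS s
CROSS JOIN work AS w
LEFT JOIN student_work AS sw
  ON sw.student_id = s.student_id
 AND sw.work_id = w.work_id
 AND sw.for_grade = 'y'
WHERE w.wname = 'test3';
```

Moving `sw.for_grade = 'y'` into the WHERE clause would drop Sheeri's row, because her for_grade is NULL after the outer join, and the query would behave like an INNER JOIN.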
47
Drop the lowest test score
● How?
● We expect 4 rows, one per student
● This is hard!
52
Drop the lowest test score
Now add in the rest...
UPDATE student_work as upd
INNER JOIN student_work as sel USING
(student_id, work_id)
RIGHT JOIN student USING (student_id, work_id)
CROSS JOIN work
SET upd.for_grade='n' WHERE wname like 'test_'
AND upd.grade_num=min(sel.grade_num)
GROUP BY sel.student_id;
53
That doesn't work....
● min() is an aggregate, so it cannot appear in WHERE, and UPDATE has no GROUP BY clause
Sometimes you need a subquery.....
UPDATE student_work
SET for_grade='n' WHERE
CONCAT(student_id,work_id) IN (SELECT
CONCAT(student_id,work_id) FROM
student CROSS JOIN work
LEFT JOIN student_work USING (student_id,
work_id) WHERE wname like 'test_'
GROUP BY student_id);
54
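That doesn't work either....
● MySQL will not let a subquery select directly from the table being updated, and the GROUP BY does not pick the work_id of the lowest grade
● Sometimes you need to do it in more than one query!
● Sometimes it's not strictly necessary, but it performs better
● The problem is the min(grade_num) ... GROUP BY
● So use a temporary table: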
● CREATE TEMPORARY TABLE grade_to_drop
SELECT min(coalesce(grade_num,0)) FROM
student CROSS JOIN work LEFT JOIN student_work
USING (student_id,work_id) WHERE wname like
'test_' group by student_id;
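The statements that follow refer to student_id, grade_num and work_id columns in grade_to_drop, which a CREATE ... SELECT with a single expression does not provide. One way to get them is to declare the table explicitly first. This is a sketch: the column types are an assumption copied from student_work, and the primary key is an illustrative choice.

```sql
-- Start clean in case the table already exists in this session.
DROP TEMPORARY TABLE IF EXISTS grade_to_drop;

-- One row per student: the lowest test grade, and later the work_id
-- it belongs to (NULL until it is filled in).
CREATE TEMPORARY TABLE grade_to_drop (
  student_id tinyint unsigned NOT NULL,
  grade_num  tinyint unsigned NOT NULL,
  work_id    tinyint unsigned DEFAULT NULL,
  PRIMARY KEY (student_id)
);
```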
55
Temporary table
56
Temporary table
INSERT INTO grade_to_drop (student_id,
grade_num)
SELECT student_id,min(coalesce(grade_num,0))
FROM student CROSS JOIN work
LEFT JOIN student_work USING
(student_id,work_id)
WHERE wname like 'test_' GROUP BY student_id;
57
Temporary table
UPDATE grade_to_drop AS gtd INNER JOIN
student_work AS sw USING
(student_id,grade_num)
SET gtd.work_id = sw.work_id ;
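Matching on (student_id, grade_num) alone can pick up a homework with the same score: in the sample data Ronald has 75 on both hw6 and test6. A hedged variant that restricts the match to tests, followed by a sketch of the final step that flags the dropped grade (the aliases are illustrative, and the final UPDATE is an assumption about where the deck is heading):

```sql
-- Fill in work_id, but only from rows that are tests.
-- If a student has the same lowest grade on two tests, MySQL updates
-- the row once with one of the matches.
UPDATE grade_to_drop AS gtd
INNER JOIN student_work AS sw USING (student_id, grade_num)
INNER JOIN work AS w ON w.work_id = sw.work_id
SET gtd.work_id = sw.work_id
WHERE w.wname LIKE 'test_';

-- Mark the chosen test as not counting toward the grade.
UPDATE student_work AS sw
INNER JOIN grade_to_drop AS gtd USING (student_id, work_id)
SET sw.for_grade = 'n';
```

A student whose lowest "grade" is a missed test (grade 0 via coalesce, with no student_work row) keeps a NULL work_id, so the second UPDATE changes nothing for them.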
58
Temporary table
59
More best practices
● EXPLAIN all your queries, and get the best “type”
possible
● Avoid JOIN hints (index hints, STRAIGHT_JOIN)
● Try to optimize subqueries into JOINs if possible
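As a sketch of the first point, here is the declarative three-table join from earlier with EXPLAIN in front:

```sql
-- Check the "type" column for each table in the output. From best to
-- worst the common values are roughly:
-- system, const, eq_ref, ref, range, index, ALL (full table scan).
EXPLAIN
SELECT name, grade_num
FROM student
INNER JOIN student_work USING (student_id)
INNER JOIN work USING (work_id)
WHERE wname = 'test1';
```

The sample schema has no index on work.wname, so the filter on wname cannot use one. On a table this small that hardly matters; on a larger one, an index on wname would be the usual thing to try before reaching for join hints.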
60
That's it!
● Questions?
● Comments?
61