Everything You Wanted To Know About Joins But..!!!
Mark N. Jauss NCR Corporation
Key Areas
What are Joins
Types of Joins
Examples of Joins
Potential problems with Joins
The Basics -- What Are Joins?
A concept in relational database theory
Technique for accessing 2+ tables in a single answer set
Each answer set row may contain data from each table
Tables are joined on a join column
ANSI 89 -vs- ANSI 92
ANSI 89 Standard
Sometimes called the Where version
No Join command necessary
ANSI 92 Standard
Requires a Join command
Requires a From clause
All other factors being equal, performance is the same
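The two syntaxes can be compared side by side. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in database and a small set of made-up rows; the table contents are hypothetical and only the column names lname, fname and departmentnum come from the examples in this deck.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (departmentnum INTEGER, department_name TEXT);
    CREATE TABLE employee (lname TEXT, fname TEXT, departmentnum INTEGER);
    INSERT INTO department VALUES (100, 'Research'), (200, 'Sales');
    INSERT INTO employee VALUES
        ('Brown', 'Ann', 100), ('Smith', 'Bob', 200), ('Jones', 'Cy', NULL);
""")

# ANSI 89: tables listed in From, join condition placed in Where.
ansi89 = conn.execute("""
    SELECT employee.lname, employee.fname
    FROM employee, department
    WHERE employee.departmentnum = department.departmentnum
    ORDER BY employee.lname
""").fetchall()

# ANSI 92: explicit Join keyword, join condition placed in On.
ansi92 = conn.execute("""
    SELECT employee.lname, employee.fname
    FROM employee INNER JOIN department
      ON employee.departmentnum = department.departmentnum
    ORDER BY employee.lname
""").fetchall()

print(ansi89)            # [('Brown', 'Ann'), ('Smith', 'Bob')]
print(ansi89 == ansi92)  # True
```

Both forms return the same answer set; the employee with no department number is excluded by either one.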
Types Of Joins
We will discuss the following Joins:
Inner Join
Outer Joins
  Full
  Left
  Right
Cross Join
Self Join
Inner Joins
Returns only the rows which match based on the join criteria
Example:
Select employee.lname, employee.fname
From employee
Inner Join department
On employee.departmentnum = department.departmentnum;
Outer Joins
Full: Both tables used to qualify and both extended with Nulls
Left: Left table used to qualify and right table has Nulls for non-matching rows
Right: Right table used to qualify and left table has Nulls for non-matching rows
Full Outer Join
Example:
Select employee.lname, employee.fname
From employee
Full Outer Join department
On employee.departmentnum = department.departmentnum;
Left Outer Join
Example:
Select employee.lname, employee.fname
From employee
Left Outer Join department
On employee.departmentnum = department.departmentnum;
Right Outer Join
Example:
Select employee.lname, employee.fname
From employee
Right Outer Join department
On employee.departmentnum = department.departmentnum;
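To see where the Nulls appear, here is a small sketch using Python's built-in sqlite3 module with hypothetical rows. Because older SQLite releases lack Right and Full Outer Join, the right outer join is written as its mirror image (a Left Outer Join with the tables swapped), and the full outer join is approximated in Python as the union of the two results; both workarounds are assumptions of this sketch, not part of the deck's SQL.

```python
import sqlite3

# Hypothetical sample data: one matching pair, one employee without a
# department, and one department without employees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (departmentnum INTEGER, department_name TEXT);
    CREATE TABLE employee (lname TEXT, departmentnum INTEGER);
    INSERT INTO department VALUES (100, 'Research'), (300, 'Legal');
    INSERT INTO employee VALUES ('Brown', 100), ('Jones', NULL);
""")

# Left Outer Join: every employee row qualifies; department columns are
# Null where no department matches.
left_rows = conn.execute("""
    SELECT e.lname, d.department_name
    FROM employee e LEFT OUTER JOIN department d
      ON e.departmentnum = d.departmentnum
    ORDER BY e.lname
""").fetchall()

# Right Outer Join of employee to department, expressed as a Left Outer
# Join with the tables swapped: every department row qualifies.
right_rows = conn.execute("""
    SELECT e.lname, d.department_name
    FROM department d LEFT OUTER JOIN employee e
      ON e.departmentnum = d.departmentnum
    ORDER BY d.department_name
""").fetchall()

# Full Outer Join approximated as the set union of both results
# (a set drops duplicate rows, which is acceptable for this tiny sample).
full_rows = set(left_rows) | set(right_rows)

print(left_rows)       # [('Brown', 'Research'), ('Jones', None)]
print(right_rows)      # [(None, 'Legal'), ('Brown', 'Research')]
print(len(full_rows))  # 3
```

Python prints a SQL Null as None, so the unmatched side of each outer join is easy to spot.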
Cross Join
Each row of one table matched with each row of another table (no ON clause)
Example:
Select employee.lname, employee.fname
From employee
Cross Join department
Where employee.employee_number = 1008;
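The row counts make the behavior concrete. This is a sketch with Python's built-in sqlite3 module and hypothetical data (3 employees, 2 departments): the unrestricted cross join yields 3 x 2 rows, and the Where clause from the example above narrows it to one row per department.

```python
import sqlite3

# Hypothetical sample data: 3 employees and 2 departments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (departmentnum INTEGER);
    CREATE TABLE employee (employee_number INTEGER, lname TEXT, fname TEXT);
    INSERT INTO department VALUES (100), (200);
    INSERT INTO employee VALUES
        (1006, 'Brown', 'Ann'), (1007, 'Smith', 'Bob'), (1008, 'Jones', 'Cy');
""")

# No Where clause: every employee is paired with every department.
all_pairs = conn.execute(
    "SELECT COUNT(*) FROM employee CROSS JOIN department"
).fetchone()[0]

# Restricting to one employee leaves one row per department.
filtered = conn.execute("""
    SELECT employee.lname, employee.fname
    FROM employee CROSS JOIN department
    WHERE employee.employee_number = 1008
""").fetchall()

print(all_pairs)  # 6
print(filtered)   # [('Jones', 'Cy'), ('Jones', 'Cy')]
```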
Self Join
Rows matching other rows within the same table
Example:
Select emp.lname, emp.fname
From employee emp
Inner Join employee mgr
On emp.departmentnum = mgr.departmentnum;
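A self join on the same column can be sketched as follows, using Python's built-in sqlite3 module and hypothetical rows. The aliases emp and mgr follow the example above; selecting a column from each alias shows which rows were paired, including each row pairing with itself.

```python
import sqlite3

# Hypothetical sample data: two employees share department 100.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (lname TEXT, fname TEXT, departmentnum INTEGER);
    INSERT INTO employee VALUES
        ('Brown', 'Ann', 100), ('Smith', 'Bob', 100), ('Jones', 'Cy', 200);
""")

# The same table appears twice, so each reference needs its own alias,
# and every column reference goes through an alias.
pairs = conn.execute("""
    SELECT emp.lname, mgr.lname
    FROM employee emp INNER JOIN employee mgr
      ON emp.departmentnum = mgr.departmentnum
    ORDER BY emp.lname, mgr.lname
""").fetchall()

print(pairs)
# [('Brown', 'Brown'), ('Brown', 'Smith'), ('Jones', 'Jones'),
#  ('Smith', 'Brown'), ('Smith', 'Smith')]
```

If self-pairs are not wanted, an extra condition such as emp.lname <> mgr.lname in the On clause would exclude them; that refinement is a suggestion of this sketch and not part of the original example.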
Alias Names
Temporary name for Table or View
Defined in From clause
Useful as abbreviation
Required in Self Join
Once defined, must be used in SQL where table name required
Multiple Table Joins
A join can have up to 64 participating tables
An n-table join requires n-1 join conditions
Omitting a join condition will result in a Cartesian Product Join
Example:
Select employee.lname, employee.fname
From employee
Inner Join department
On employee.departmentnum = department.departmentnum
Inner Join job
On employee.jobcode = job.jobcode;
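The three-table case (n = 3, so 2 join conditions) can be tried out with the sketch below. It assumes Python's built-in sqlite3 module and hypothetical rows; the department_name and description columns are invented for illustration so that the output shows a value from each table.

```python
import sqlite3

# Hypothetical sample data for three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (departmentnum INTEGER, department_name TEXT);
    CREATE TABLE job (jobcode INTEGER, description TEXT);
    CREATE TABLE employee (lname TEXT, fname TEXT,
                           departmentnum INTEGER, jobcode INTEGER);
    INSERT INTO department VALUES (100, 'Research'), (200, 'Sales');
    INSERT INTO job VALUES (1, 'Analyst'), (2, 'Manager');
    INSERT INTO employee VALUES
        ('Brown', 'Ann', 100, 1), ('Smith', 'Bob', 200, 2);
""")

# Three tables, two join conditions (n - 1).
rows = conn.execute("""
    SELECT employee.lname, department.department_name, job.description
    FROM employee
    INNER JOIN department
      ON employee.departmentnum = department.departmentnum
    INNER JOIN job
      ON employee.jobcode = job.jobcode
    ORDER BY employee.lname
""").fetchall()

print(rows)
# [('Brown', 'Research', 'Analyst'), ('Smith', 'Sales', 'Manager')]
```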
Potential Problems With Joins
Generating a Cartesian Product Join (spool problems)
How:
Cross Join with no Where clause
Improper aliasing
Omitting a Join condition
Improper Join condition
Improper aliasing:
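Select employee.lname, employee.fname
From employee emp
Inner Join department dep
On emp.departmentnum = dep.departmentnum;
The Select list uses the table name employee instead of the alias emp, so employee is treated as an additional table reference with no join condition.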
Omitting a Join condition:
Select employee.lname, employee.fname
From employee
Inner Join department
On employee.departmentnum = department.departmentnum
Inner Join job;
Improper Join condition:
Select employee.lname, employee.fname
From employee
Inner Join department
On 3 = 3;
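The effect of a missing or always-true join condition shows up directly in the row counts. The sketch below assumes Python's built-in sqlite3 module and hypothetical data (3 employees, 2 departments, 3 jobs). SQLite happens to accept an Inner Join with no On clause and treats it as a cross join; other databases may reject that statement outright, so this illustrates the row explosion rather than any one product's behavior.

```python
import sqlite3

# Hypothetical sample data: 3 employees, 2 departments, 3 jobs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (departmentnum INTEGER);
    CREATE TABLE job (jobcode INTEGER);
    CREATE TABLE employee (lname TEXT, departmentnum INTEGER, jobcode INTEGER);
    INSERT INTO department VALUES (100), (200);
    INSERT INTO job VALUES (1), (2), (3);
    INSERT INTO employee VALUES
        ('Brown', 100, 1), ('Smith', 200, 2), ('Jones', 100, 3);
""")

def count(sql):
    """Return the number of rows produced by the given join."""
    return conn.execute("SELECT COUNT(*) " + sql).fetchone()[0]

# Proper join condition: one row per employee.
proper = count("""FROM employee INNER JOIN department
                  ON employee.departmentnum = department.departmentnum""")

# Improper join condition: 3 = 3 is true for every pair of rows.
improper = count("FROM employee INNER JOIN department ON 3 = 3")

# Omitted join condition on job: each matched row repeats once per job.
omitted = count("""FROM employee INNER JOIN department
                   ON employee.departmentnum = department.departmentnum
                   INNER JOIN job""")

print(proper, improper, omitted)  # 3 6 9
```

On tables with millions of rows the same multiplication is what exhausts spool space.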
Comparing Subqueries To Joins
Subquery answer set only displays from outer query table
Subquery easier to write (?)
All things equal, Joins are usually better performers
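The two formulations can be checked against each other on a small sample. This sketch assumes Python's built-in sqlite3 module and hypothetical rows, and uses the column names from the Explain examples in this deck. It compares answer sets only; it says nothing about performance, which depends on the database and its optimizer.

```python
import sqlite3

# Hypothetical sample data; department_number is unique in department.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (department_number INTEGER, department_name TEXT);
    CREATE TABLE employee (first_name TEXT, last_name TEXT,
                           department_number INTEGER);
    INSERT INTO department VALUES
        (100, 'Research'), (200, 'Sales'), (300, 'Applied Research');
    INSERT INTO employee VALUES
        ('Ann', 'Brown', 100), ('Bob', 'Smith', 200), ('Cy', 'Jones', 300);
""")

# Subquery form: only columns of the outer table (employee) can be shown.
via_subquery = conn.execute("""
    SELECT first_name, last_name, department_number
    FROM employee
    WHERE department_number IN
          (SELECT department_number
           FROM department
           WHERE department_name LIKE '%Research%')
    ORDER BY last_name
""").fetchall()

# Join form: department columns could also be added to the Select list.
via_join = conn.execute("""
    SELECT employee.first_name, employee.last_name,
           employee.department_number
    FROM employee INNER JOIN department
      ON employee.department_number = department.department_number
    WHERE department.department_name LIKE '%Research%'
    ORDER BY employee.last_name
""").fetchall()

print(via_subquery)  # [('Ann', 'Brown', 100), ('Cy', 'Jones', 300)]
print(via_subquery == via_join)  # True
```

The answer sets match here because each department number appears once in department; with duplicate department numbers the join would repeat employee rows while the subquery would not.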
Sample Explain Of Subquery
EXPLAIN
SELECT first_name
      ,last_name
      ,department_number
FROM employee
WHERE department_number IN
      (SELECT department_number
       FROM department
       WHERE department_name LIKE '%Research%');
Sample Explain Output Of Subquery
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to CUSTOMER_SERVICE.department with a condition of ("CUSTOMER_SERVICE.department.department_name LIKE '%Research%'"). Spool 2 and CUSTOMER_SERVICE.department are joined using a merge join, with a join condition of ("Spool_2.department_number = CUSTOMER_SERVICE.department.department_number"). The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 24 rows. The estimated time for this step is 0.18 seconds.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0 hours and 0.24 seconds.
Sample Explain Of Join
EXPLAIN
SELECT employee.first_name
      ,employee.last_name
      ,employee.department_number
FROM employee
INNER JOIN department
ON employee.department_number = department.department_number
WHERE department.department_name LIKE '%Research%';
Sample Explain Output Of Join
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to CUSTOMER_SERVICE.department with a condition of ("CUSTOMER_SERVICE.department.department_name LIKE '%Research%'"). Spool 2 and CUSTOMER_SERVICE.department are joined using a merge join, with a join condition of ("Spool_2.department_number = CUSTOMER_SERVICE.department.department_number"). The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 24 rows. The estimated time for this step is 0.18 seconds.
6) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0 hours and 0.23 seconds.
Summary
Several different types of Joins
Inner Join
Full Outer Join
Left Outer Join
Right Outer Join
Cross Join
Self Join
Potential problems
Improper aliasing
Improper Join condition
Omitting a Join condition
Subqueries -vs- Joins
Question and Answer