Joins and Sub Queries
JOINS
Definition
A join is a temporary relationship that you create between two tables within a database query. It lets the query combine rows from the two tables whether or not a permanent relationship (such as a foreign key) has been defined between them, provided the columns being compared have compatible data types. Tables that are joined in a query are related in that query only, and nowhere else. The type of join determines which rows appear in the result set, or which rows the query acts on. You can create the following types of joins:
Types of Joins
Inner join
A join that displays only the rows that have a match in both joined tables. (This is the default type of join in the Query and View Designer.) For example, you can join the titles and publishers tables to create a result set that shows the publisher name for each title. In an inner join, titles for which you do not have publisher information are not included in the result set, nor are publishers with no titles. The resulting SQL for such a join might look like this:

    SELECT title, pub_name
    FROM titles INNER JOIN publishers
    ON titles.pub_id = publishers.pub_id

Columns containing NULL do not match any values when you are creating an inner join and are therefore excluded from the result set. Null values do not match other null values.

Outer join
A join that includes rows even if they do not have related rows in the joined table. You can create three variations of an outer join to specify the unmatched rows to be included:

Left outer join
All rows from the first-named table (the "left" table, which appears leftmost in the JOIN clause) are included. Unmatched rows in the right table do not appear. For example, the following SQL statement illustrates a left outer join between the titles and publishers tables to include all titles, even those you do not have publisher information for:

    SELECT titles.title_id, titles.title, publishers.pub_name
    FROM titles LEFT OUTER JOIN publishers
    ON titles.pub_id = publishers.pub_id

Right outer join
All rows in the second-named table (the "right" table, which appears rightmost in the JOIN clause) are included. Unmatched rows in the left table are not included. For example, a right outer join between the titles and publishers tables will include all publishers, even those who have no titles in the titles table. The resulting SQL might look like this:

    SELECT titles.title_id, titles.title, publishers.pub_name
    FROM titles RIGHT OUTER JOIN publishers
    ON titles.pub_id = publishers.pub_id

Full outer join
All rows in all joined tables are included, whether they are matched or not. For example, a full outer join between titles and publishers shows all titles and all publishers, even those that have no match in the other table.

    SELECT titles.title_id, titles.title, publishers.pub_name
    FROM titles FULL OUTER JOIN publishers
    ON titles.pub_id = publishers.pub_id

Cross Join
A join whose result set includes one row for each possible pairing of rows from the two tables. For example, authors CROSS JOIN publishers yields a result set with one row for each possible author/publisher combination. The resulting SQL might look like this:

    SELECT * FROM authors CROSS JOIN publishers

Self Join
When rows in a table must be related to other rows of the same table, use a self join: reference the table twice under different aliases. For example, the following query pairs employees who live in the same country:

    SELECT F.EmployeeID, F.LastName, S.EmployeeID, S.LastName, F.Country
    FROM Employee F INNER JOIN Employee S
    ON F.Country = S.Country

Because the join condition compares only Country, each employee is also paired with itself. Add a condition such as F.EmployeeID <> S.EmployeeID to exclude those rows.
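The difference between inner, left outer, and cross joins is easiest to see on a tiny data set. The following is a minimal sketch, not part of the original material: it assumes Python's built-in sqlite3 module as a stand-in database engine, and the two publishers and two titles are made-up sample rows. One title has a NULL pub_id and one publisher has no titles.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample rows (assumption, not pubs data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publishers (pub_id TEXT PRIMARY KEY, pub_name TEXT);
    CREATE TABLE titles (title_id TEXT PRIMARY KEY, title TEXT, pub_id TEXT);
    INSERT INTO publishers VALUES ('P1', 'Alpha Press'), ('P2', 'Beta Books');
    INSERT INTO titles VALUES
        ('T1', 'Intro to SQL', 'P1'),
        ('T2', 'Orphan Title', NULL);
""")

# Inner join: only rows with a match on both sides survive.
inner_rows = conn.execute("""
    SELECT title, pub_name
    FROM titles INNER JOIN publishers
    ON titles.pub_id = publishers.pub_id
    ORDER BY title
""").fetchall()

# Left outer join: every title is kept; missing publisher data becomes NULL (None).
left_rows = conn.execute("""
    SELECT title, pub_name
    FROM titles LEFT OUTER JOIN publishers
    ON titles.pub_id = publishers.pub_id
    ORDER BY title
""").fetchall()

# Cross join: one row per possible title/publisher pairing (2 x 2 rows here).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM titles CROSS JOIN publishers"
).fetchone()[0]

print(inner_rows)
print(left_rows)
print(cross_count)
```

In this sketch the title with a NULL pub_id drops out of the inner join but is kept by the left outer join, and the publisher without titles appears in neither.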
Equi Join
An equi-join, also known as an equijoin, is a specific type of comparator-based join, or theta join, that uses only equality comparisons in the join predicate. Using other comparison operators (such as <) disqualifies a join as an equi-join. When an equi-join is combined with * in the select list, the result set contains redundant column data, because the join column from each table appears in the output:

    SELECT *
    FROM employee JOIN department
    ON employee.DepartmentID = department.DepartmentID;
Natural Join
A natural join is a further specialization of the equi-join that removes the redundant column data from the result set. The join predicate arises implicitly by comparing all columns that have the same column name in both joined tables. The resulting joined table contains only one column for each pair of equally named columns.

    SELECT *
    FROM employee NATURAL JOIN department;
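To make the "redundant column" distinction concrete, the sketch below compares the column lists returned by an equi-join with * and by a natural join. It is an illustration under stated assumptions: SQLite (through Python's sqlite3 module) is used because it supports NATURAL JOIN, the sample rows are invented, and column_names is a hypothetical helper defined only for this example.

```python
import sqlite3

# Hypothetical employee/department rows in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER);
    INSERT INTO department VALUES (31, 'Sales'), (33, 'Engineering');
    INSERT INTO employee VALUES ('Rafferty', 31), ('Jones', 33);
""")

def column_names(sql):
    """Return the result-set column names of a query (helper for this sketch)."""
    cursor = conn.execute(sql)
    return [col[0] for col in cursor.description]

# Equi-join with *: DepartmentID comes back once per table.
equi_cols = column_names(
    "SELECT * FROM employee JOIN department "
    "ON employee.DepartmentID = department.DepartmentID"
)

# Natural join: the shared column is returned only once.
natural_cols = column_names("SELECT * FROM employee NATURAL JOIN department")

print(equi_cols)
print(natural_cols)
```

Both queries return the same rows here; only the shape of the result set differs.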
Join Columns
The JOIN operator matches rows by comparing values in one table with values in another. You decide which columns from each table should be matched. You have several choices:

Related Columns
Typically, you join tables by matching values in columns for which a foreign-key relationship exists. For example, you can join discounts to stores by matching the values of stor_id in the respective tables. The resulting SQL might look like this:

    SELECT *
    FROM discounts INNER JOIN stores
    ON stores.stor_id = discounts.stor_id

Unrelated Columns
You can also join tables by matching values in columns for which no foreign-key relationship exists. For example, you can join publishers to authors by matching the values of state in the respective tables. Such a join yields a result set in which each row describes an author-publisher pair located in the same state.

    SELECT au_lname, au_fname, pub_name, authors.state
    FROM authors INNER JOIN publishers
    ON authors.state = publishers.state
Other
You can match rows using some test other than equality. For example, to find the employees and the jobs for which they are underqualified, you can join employee with jobs, matching rows in which the job's minimum required level exceeds the employee's job level. The resulting SQL might look like this:

    SELECT fname, minit, lname, job_desc, job_lvl, min_lvl
    FROM employee INNER JOIN jobs
    ON employee.job_lvl < jobs.min_lvl
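A join on a non-equality test can pair one row with several rows on the other side. The following minimal sketch runs the "underqualified" idea on invented data, assuming SQLite via Python's sqlite3 module; the table layouts are simplified to the columns the comparison needs.

```python
import sqlite3

# Simplified, hypothetical employee and jobs tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (lname TEXT, job_lvl INTEGER);
    CREATE TABLE jobs (job_desc TEXT, min_lvl INTEGER);
    INSERT INTO employee VALUES ('Smith', 50), ('Lee', 120);
    INSERT INTO jobs VALUES ('Editor', 25), ('Designer', 100), ('Publisher', 150);
""")

# Match each employee with every job whose minimum level exceeds the employee's level.
underqualified = conn.execute("""
    SELECT lname, job_desc
    FROM employee INNER JOIN jobs
    ON employee.job_lvl < jobs.min_lvl
    ORDER BY lname, job_desc
""").fetchall()

print(underqualified)
```

Smith (level 50) is matched with two jobs and Lee (level 120) with one, so the result has three rows even though there are only two employees.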
Join Tables
When combining data from multiple tables, you must decide what tables to use. There are several noteworthy considerations:

Combining Three or More Tables
Each JOIN operation combines two tables. However, you can use multiple JOIN operations within one query to assemble data from any number of tables. Because the result of each JOIN operation is effectively a table, you can use that result as an operand in a subsequent join operation. For example, to create a result set in which each row contains a book title, an author, and the percentage of that book's royalties the author receives, you must combine data from three tables: authors, titles, and titleauthor. The resulting SQL might look like this:

    SELECT title, au_fname, au_lname, royaltyper
    FROM authors
    INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id
    INNER JOIN titles ON titleauthor.title_id = titles.title_id
Using a Table Merely to Join Others
You can include a table in a join even if you do not want to include any of that table's columns in a result set. For example, to establish a result set in which each row describes a title-store pair in which that store sells that title, you include columns from two tables: titles and stores. But you must use a third table, sales, to determine which stores have sold which titles. The resulting SQL might look like this:

    SELECT title, stor_name
    FROM titles
    INNER JOIN sales ON titles.title_id = sales.title_id
    INNER JOIN stores ON sales.stor_id = stores.stor_id

Notice that the sales table contributes no columns to the result set.

Using a Table Twice in One Query
You can use the same table two (or more) times within a single query.

Using Something Else in Place of a Table
In place of a table, you can use a query, a view, or a user-defined function that returns a table.
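Two of the points above, using a table only as a bridge and using a query in place of a table, can be sketched together. This is an illustrative example rather than the original author's code: it assumes SQLite through Python's sqlite3 module, the rows are invented, and store_totals is a hypothetical alias for the derived table.

```python
import sqlite3

# Hypothetical titles, stores, and sales rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE titles (title_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE stores (stor_id TEXT PRIMARY KEY, stor_name TEXT);
    CREATE TABLE sales (stor_id TEXT, title_id TEXT, qty INTEGER);
    INSERT INTO titles VALUES ('T1', 'Intro to SQL'), ('T2', 'Advanced Joins');
    INSERT INTO stores VALUES ('S1', 'Book Nook'), ('S2', 'Page Turner');
    INSERT INTO sales VALUES ('S1', 'T1', 5), ('S1', 'T2', 3), ('S2', 'T1', 10);
""")

# sales only links titles to stores; none of its columns are selected.
pairs = conn.execute("""
    SELECT title, stor_name
    FROM titles
    INNER JOIN sales ON titles.title_id = sales.title_id
    INNER JOIN stores ON sales.stor_id = stores.stor_id
    ORDER BY title, stor_name
""").fetchall()

# A query (derived table) used where a table name would normally go.
totals = conn.execute("""
    SELECT stor_name, store_totals.total_qty
    FROM stores
    INNER JOIN (SELECT stor_id, SUM(qty) AS total_qty
                FROM sales
                GROUP BY stor_id) AS store_totals
    ON stores.stor_id = store_totals.stor_id
    ORDER BY stor_name
""").fetchall()

print(pairs)
print(totals)
```

The derived table must be given an alias (store_totals here) so that its columns can be referenced in the ON clause and the select list.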
SUBQUERIES
Definition
A subquery is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. A subquery can be used anywhere an expression is allowed. In this example a subquery is used as a column expression named MaxUnitPrice in a SELECT statement.

    USE AdventureWorks2008R2;
    GO
    SELECT Ord.SalesOrderID, Ord.OrderDate,
        (SELECT MAX(OrdDet.UnitPrice)
         FROM AdventureWorks2008R2.Sales.SalesOrderDetail AS OrdDet
         WHERE Ord.SalesOrderID = OrdDet.SalesOrderID) AS MaxUnitPrice
    FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS Ord

A subquery is also called an inner query or inner select, while the statement containing a subquery is also called an outer query or outer select.

Many Transact-SQL statements that include subqueries can be alternatively formulated as joins. Other questions can be posed only with subqueries. In Transact-SQL, there is usually no performance difference between a statement that includes a subquery and a semantically equivalent version that does not. However, in some cases where existence must be checked, a join yields better performance. Otherwise, the nested query must be processed for each result of the outer query to ensure elimination of duplicates. In such cases, a join approach would yield better results. The following is an example showing both a subquery SELECT and a join SELECT that return the same result set:

    /* SELECT statement built using a subquery. */
    SELECT Name
    FROM AdventureWorks2008R2.Production.Product
    WHERE ListPrice =
        (SELECT ListPrice
         FROM AdventureWorks2008R2.Production.Product
         WHERE Name = 'Chainring Bolts');

    /* SELECT statement built using a join that returns the same result set. */
    SELECT Prd1.Name
    FROM AdventureWorks2008R2.Production.Product AS Prd1
    JOIN AdventureWorks2008R2.Production.Product AS Prd2
    ON (Prd1.ListPrice = Prd2.ListPrice)
    WHERE Prd2.Name = 'Chainring Bolts';

A subquery nested in the outer SELECT statement has the following components:
- A regular SELECT query including the regular select list components.
- A regular FROM clause including one or more table or view names.
- An optional WHERE clause.
- An optional GROUP BY clause.
- An optional HAVING clause.
The SELECT query of a subquery is always enclosed in parentheses. It cannot include a COMPUTE or FOR BROWSE clause, and may only include an ORDER BY clause when a TOP clause is also specified.

A subquery can be nested inside the WHERE or HAVING clause of an outer SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. Up to 32 levels of nesting are possible, although the limit varies based on available memory and the complexity of other expressions in the query. Individual queries may not support nesting up to 32 levels. A subquery can appear anywhere an expression can be used, if it returns a single value.

If a table appears only in a subquery and not in the outer query, then columns from that table cannot be included in the output (the select list of the outer query).

Statements that include a subquery usually take one of these formats:
- WHERE expression [NOT] IN (subquery)
- WHERE expression comparison_operator [ANY | ALL] (subquery)
- WHERE [NOT] EXISTS (subquery)
In some Transact-SQL statements, the subquery can be evaluated as if it were an independent query. Conceptually, the subquery results are substituted into the outer query (although this is not necessarily how Microsoft SQL Server actually processes Transact-SQL statements with subqueries).
There are three basic types of subqueries. Those that:

- Operate on lists, introduced with IN or with a comparison operator modified by ANY or ALL.
- Are introduced with an unmodified comparison operator and must return a single value.
- Are existence tests introduced with EXISTS.
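The three basic forms can be tried side by side. The sketch below is a hedged illustration, not the documented Transact-SQL behavior: it assumes SQLite through Python's sqlite3 module, which supports IN, scalar comparisons, and EXISTS but not the ANY and ALL modifiers, so only IN stands in for the list form. The product and order_line tables and the names helper are invented for the example.

```python
import sqlite3

# Hypothetical product catalogue and order lines.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (name TEXT, category TEXT, list_price REAL);
    CREATE TABLE order_line (product_name TEXT, qty INTEGER);
    INSERT INTO product VALUES
        ('Bolt', 'Hardware', 1.0), ('Nut', 'Hardware', 1.0),
        ('Chain', 'Drivetrain', 20.0), ('Pedal', 'Drivetrain', 40.0);
    INSERT INTO order_line VALUES ('Bolt', 100), ('Chain', 2);
""")

def names(sql):
    """Run a query and return the first column of each row (helper for this sketch)."""
    return [row[0] for row in conn.execute(sql)]

# 1. List subquery introduced with IN: products that appear on an order line.
ordered = names("""
    SELECT name FROM product
    WHERE name IN (SELECT product_name FROM order_line)
    ORDER BY name
""")

# 2. Unmodified comparison operator: the subquery must yield a single value.
same_price = names("""
    SELECT name FROM product
    WHERE list_price = (SELECT list_price FROM product WHERE name = 'Bolt')
    ORDER BY name
""")

# 3. Existence test: products with no order line at all.
never_ordered = names("""
    SELECT name FROM product AS p
    WHERE NOT EXISTS (SELECT * FROM order_line AS o
                      WHERE o.product_name = p.name)
    ORDER BY name
""")

print(ordered, same_price, never_ordered)
```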
Subquery Rules
A subquery is subject to the following restrictions:
- The select list of a subquery introduced with a comparison operator can include only one expression or column name (except that EXISTS and IN operate on SELECT * or a list, respectively).
- If the WHERE clause of an outer query includes a column name, it must be join-compatible with the column in the subquery select list.
- The ntext, text, and image data types cannot be used in the select list of subqueries.
- Because they must return a single value, subqueries introduced by an unmodified comparison operator (one not followed by the keyword ANY or ALL) cannot include GROUP BY and HAVING clauses.
- The DISTINCT keyword cannot be used with subqueries that include GROUP BY.
- The COMPUTE and INTO clauses cannot be specified.
- ORDER BY can only be specified when TOP is also specified.
- A view created by using a subquery cannot be updated.
- The select list of a subquery introduced with EXISTS, by convention, has an asterisk (*) instead of a single column name. The rules for a subquery introduced with EXISTS are the same as those for a standard select list, because a subquery introduced with EXISTS creates an existence test and returns TRUE or FALSE, instead of data.
Subquery Types
Subqueries can be specified in many places:
- With aliases.
- With IN or NOT IN.
- In UPDATE, DELETE, and INSERT statements.
- With comparison operators.
- With ANY, SOME, or ALL.
- With EXISTS or NOT EXISTS.
- In place of an expression.
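Two of the placements listed above, a subquery in place of an expression and a subquery inside an UPDATE statement, are sketched here. The example assumes SQLite through Python's sqlite3 module and uses invented order_header and order_detail tables that loosely mirror the shape of the MaxUnitPrice example; it is an illustration only.

```python
import sqlite3

# Hypothetical order tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_header (order_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE order_detail (order_id INTEGER, unit_price REAL);
    INSERT INTO order_header VALUES (1, 'open'), (2, 'open');
    INSERT INTO order_detail VALUES (1, 10.0), (1, 25.0), (2, 7.5);
""")

# Subquery in place of an expression: one computed column per outer row.
max_prices = conn.execute("""
    SELECT h.order_id,
           (SELECT MAX(d.unit_price)
            FROM order_detail AS d
            WHERE d.order_id = h.order_id) AS max_unit_price
    FROM order_header AS h
    ORDER BY h.order_id
""").fetchall()

# Subquery in an UPDATE: flag orders that contain an item priced above 20.
conn.execute("""
    UPDATE order_header
    SET status = 'review'
    WHERE order_id IN (SELECT order_id FROM order_detail WHERE unit_price > 20)
""")
statuses = conn.execute(
    "SELECT order_id, status FROM order_header ORDER BY order_id"
).fetchall()

print(max_prices)
print(statuses)
```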
Correlated Subqueries
Many queries can be evaluated by executing the subquery once and substituting the resulting value or values into the WHERE clause of the outer query. In queries that include a correlated subquery (also known as a repeating subquery), the subquery depends on the outer query for its values. This means that the subquery is executed repeatedly, once for each row that might be selected by the outer query.

This query retrieves one instance of each employee's first and last name for which the bonus in the SalesPerson table is 5000 and for which the employee identification numbers match in the Employee and SalesPerson tables.

    USE AdventureWorks2008R2;
    GO
    SELECT DISTINCT c.LastName, c.FirstName, e.BusinessEntityID
    FROM Person.Person AS c
    JOIN HumanResources.Employee AS e
    ON e.BusinessEntityID = c.BusinessEntityID
    WHERE 5000.00 IN
        (SELECT Bonus
         FROM Sales.SalesPerson sp
         WHERE e.BusinessEntityID = sp.BusinessEntityID);
    GO
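A correlated subquery can often be rewritten as a join that returns the same rows. The following sketch compares the two forms on a cut-down, hypothetical version of the employee and sales-person data, assuming SQLite through Python's sqlite3 module rather than SQL Server; the table and column names are simplified stand-ins.

```python
import sqlite3

# Simplified stand-ins for the Employee and SalesPerson tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (business_entity_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE sales_person (business_entity_id INTEGER, bonus REAL);
    INSERT INTO employee VALUES (1, 'Ansman'), (2, 'Blythe'), (3, 'Carson');
    INSERT INTO sales_person VALUES (1, 5000.0), (2, 3500.0);
""")

# Correlated form: the inner query refers to e, a row of the outer query.
correlated = conn.execute("""
    SELECT e.last_name
    FROM employee AS e
    WHERE 5000.00 IN (SELECT sp.bonus
                      FROM sales_person AS sp
                      WHERE sp.business_entity_id = e.business_entity_id)
    ORDER BY e.last_name
""").fetchall()

# Join form: the same condition expressed as a join plus a filter.
joined = conn.execute("""
    SELECT DISTINCT e.last_name
    FROM employee AS e
    JOIN sales_person AS sp
    ON sp.business_entity_id = e.business_entity_id
    WHERE sp.bonus = 5000.00
    ORDER BY e.last_name
""").fetchall()

print(correlated)
print(joined)
```

DISTINCT is kept in the join form because a join can repeat an employee when several matching rows exist, whereas the IN test returns each outer row at most once.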