Lecture04 Subqueries
Define subqueries
Describe the type of problems that subqueries can solve
List the types of subqueries
Write single-row and multiple-row subqueries
This chapter covers the more advanced features of the SELECT statement.
You can write subqueries in the WHERE clause of another SQL statement to obtain values based on
a condition whose value is not known in advance.
This chapter covers single row subqueries and multiple row subqueries.
Problem: who earns more money than Abel?
Solution: 2 steps
Find out how much Abel earns
Find out who earns more than that amount
That normally requires two queries, and the result of the first query has to be passed into the second query.
Writing two separate queries does not do that.
We need a subquery to find Abel's salary and pass it to the main query that produces the results.
Subquery Syntax
A subquery is a SELECT statement that is embedded in a clause of another SELECT statement.
It is useful when you need to select rows from a table with a condition that depends on data from the
same table or other tables.
Where used
On the following clauses:
WHERE clause
HAVING clause
FROM clause
ORDER of OPERATION
The subquery generally executes first, and its output is then fed to the main or OUTER query.
The slide shows how we solve the problem of who earns more money than Abel.
Note that the subquery executes first and returns the value 11,000.
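The slide itself is not reproduced in these notes. A minimal sketch of what such a query looks like, assuming the Oracle HR sample schema (an EMPLOYEES table with LAST_NAME and SALARY columns):

```sql
-- Inner query runs first and returns Abel's salary (11000 per these notes).
-- Outer query then lists everyone who earns more than that value.
SELECT last_name
FROM   employees
WHERE  salary > (SELECT salary
                 FROM   employees
                 WHERE  last_name = 'Abel');
```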
ASIDE:
A better example would be to show the salary in the output.
Place the Subquery on the right side of the comparison operator for readability
You can write it the other way around, but it is harder to read:
SELECT * FROM employees
WHERE (SELECT salary FROM employees WHERE last_name = 'Abel') < salary;
An ORDER BY clause in the subquery is only needed when performing TOP-N analysis
- Normally the ORDER BY clause is only found at the end of the SQL statement.
- TOP-N analysis refers to finding the top N rows.
- Example: the top seven salaries
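As an illustration of TOP-N analysis, here is one common Oracle pattern for the top seven salaries. It is a sketch that assumes the HR EMPLOYEES table and the Oracle ROWNUM pseudocolumn; other databases use different syntax.

```sql
-- The subquery in the FROM clause sorts the rows first (this is the one
-- place an ORDER BY inside a subquery is needed); ROWNUM then keeps the
-- first seven rows of the sorted result.
SELECT last_name, salary
FROM   (SELECT last_name, salary
        FROM   employees
        ORDER  BY salary DESC)
WHERE  ROWNUM <= 7;
```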
Types of Subqueries:
- Single-row Subqueries that return only one row from the inner SELECT statement
- Multiple-row Subqueries return more than one row from the inner SELECT statement
Special note:
There are Subqueries that return multiple columns. These are covered later.
Single-Row Subqueries:
Single-row subqueries return only one row from the inner SELECT statement, so single-row
comparison operators are used (=, >, >=, <, <=, <>).
NOTE: you cannot use the equal to operator (or any other single-row operator) when you are comparing something to multiple rows.
PROBLEM:
Display the employees whose job ID is the same as that of employee 141
SOLUTION:
First find the job ID for employee 141.
Use that job ID in the WHERE clause of the main SELECT statement to select the employees with
the same job ID.
Note: I often write the inner query (the subquery) first to see what it returns, then I write the main query.
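The two steps above can be sketched as follows, assuming the HR EMPLOYEES table with EMPLOYEE_ID and JOB_ID columns:

```sql
-- Step 1 (inner query): find the job ID of employee 141.
-- Step 2 (outer query): list the employees who have that job ID.
SELECT last_name, job_id
FROM   employees
WHERE  job_id = (SELECT job_id
                 FROM   employees
                 WHERE  employee_id = 141);
```

Running the inner SELECT on its own first is a quick way to confirm that it returns exactly one row.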
Many subqueries can be used
QUERY BLOCKS
A SELECT statement is often called a query block.
In the example on the slide there are 3 query blocks: one outer query and two inner queries.
The inner query blocks execute first, bringing back the results ST_CLERK and 2600.
The outer query block is then processed as if the WHERE clause were hard coded with the values
that were returned from the inner queries.
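The three-block example is not reproduced in these notes. The sketch below is an assumption about its shape: the first subquery is taken from the employee 141 problem, while the second subquery (the salary of employee 143) is a guess chosen to match the returned value 2600.

```sql
-- Outer block plus two inner blocks. Each inner block must return one row.
SELECT last_name, job_id, salary
FROM   employees
WHERE  job_id = (SELECT job_id            -- returns ST_CLERK
                 FROM   employees
                 WHERE  employee_id = 141)
AND    salary > (SELECT salary            -- assumed to return 2600
                 FROM   employees
                 WHERE  employee_id = 143);  -- employee 143 is an assumption
```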
NOTE:
The Subquery can get information from different tables.
SOLUTION:
This example demonstrates that you can get information from the Subquery when the Subquery has
a group function in it.
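A minimal sketch of a subquery containing a group function, assuming the HR EMPLOYEES table. It lists the employee or employees who earn the lowest salary in the company:

```sql
-- The inner query returns a single value: the lowest salary.
-- MIN is legal here because it appears in a SELECT list, not in WHERE.
SELECT last_name, job_id, salary
FROM   employees
WHERE  salary = (SELECT MIN(salary)
                 FROM   employees);
```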
NOTE:
The following statement fails because you can't use a group function in a WHERE clause:
SELECT LAST_NAME, JOB_ID, SALARY
FROM EMPLOYEES
WHERE SALARY = MIN(SALARY);
Step 2-Since you want to find the minimum salary in each of the other departments, you need the group function
in the main query.
Step 3-But you do not want all of the groups displayed. You want only the departments whose minimum salary
is greater than the minimum salary in department 50. Limiting which groups are displayed requires a
HAVING clause.
SOLUTION
SELECT department_id, MIN(salary)
FROM employees
GROUP BY department_id
HAVING MIN(salary) > (SELECT MIN(salary)
                      FROM employees
                      WHERE department_id = 50);
Error:
More than one row is returned. A value cannot be equal to more than one value.
When you use a GROUP BY there is an implication that multiple rows will be returned. In this
case the subquery returns 7 rows, because each department ID in the employees table
generates its own minimum salary.
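The failing statement is not reproduced in these notes. A sketch of the kind of query that triggers this error, assuming the HR EMPLOYEES table:

```sql
-- Fails in Oracle with ORA-01427 (single-row subquery returns more than
-- one row): the GROUP BY makes the inner query return one minimum per
-- department, but = can only compare against a single value.
SELECT employee_id, last_name
FROM   employees
WHERE  salary = (SELECT MIN(salary)
                 FROM   employees
                 GROUP  BY department_id);
```

Replacing = with a multiple-row operator such as IN makes the statement valid.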
COMMON PROBLEM:
The statement on the slide is syntactically correct, but the subquery did not return any rows (no employee named Haas exists).
The subquery passes a NULL value back to the right-hand side of the condition in the WHERE clause.
There is no job ID that is equal to NULL.
Therefore, no rows are selected.
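A sketch of the situation described above, assuming the HR EMPLOYEES table and that no employee has the last name Haas:

```sql
-- The inner query finds no rows, so the comparison becomes job_id = NULL,
-- which is never true. The statement runs without error and selects no rows.
SELECT last_name, job_id
FROM   employees
WHERE  job_id = (SELECT job_id
                 FROM   employees
                 WHERE  last_name = 'Haas');
```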
SPECIAL NOTE:
If there were a row whose job ID is NULL, then the left side value would be NULL and the right side
value would be NULL. It is tempting to think that NULL would then be equal to NULL and the row would be
displayed. It is not displayed.
For the row to be displayed, the WHERE clause must evaluate to TRUE.
Because a comparison of two NULL values results in NULL (rather than TRUE or FALSE), the WHERE
condition is not true.
Multiple-Row Subqueries:
To use a subquery that returns more than one row you need to use a multiple-row operator (IN, ANY or ALL).
ANY operator
Looking at the outer query, the slide displays employees who are not IT programmers
and whose salary is less than ANY salary that is returned by the inner subquery.
The inner subquery sends back all the salaries for the job ID IT_PROG.
It returns 3 salaries with the values 9000, 6000 and 4200.
Since the outer query is looking for a salary less than ANY of the IT programmer salaries, it is
looking for a value that is less than 4200 or less than 6000 or less than 9000. In other words, it is
looking for a value less than the maximum value returned by the inner subquery. The maximum
value is $9000.
Without the final line of the query (the condition that excludes IT_PROG), the result would also include IT_PROG employees.
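A sketch of the ANY query described above, assuming the HR EMPLOYEES table. The column list is an illustrative choice.

```sql
-- salary < ANY (9000, 6000, 4200) is true when the salary is below at
-- least one of the values, that is, below the maximum (9000).
SELECT employee_id, last_name, job_id, salary
FROM   employees
WHERE  salary < ANY (SELECT salary
                     FROM   employees
                     WHERE  job_id = 'IT_PROG')
AND    job_id <> 'IT_PROG';   -- final line: exclude the IT programmers
```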
NOTE:
< ANY -- less than any means less than the maximum value returned
> ANY -- greater than any means more than the minimum value returned
= ANY -- equal to any is the equivalent of the IN operator
ALL operator
The example on the slide displays employees whose salary is less than the salaries of all the
employees that have a job_id of IT_PROG
AND
whose job is not IT_PROG.
Again there are three values being returned. They are 9000, 6000 and 4200.
To be less than ALL of them means you have to be less than 4200, the minimum.
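A sketch of the ALL query described above, assuming the HR EMPLOYEES table. The column list is an illustrative choice.

```sql
-- salary < ALL (9000, 6000, 4200) is true only when the salary is below
-- every one of the values, that is, below the minimum (4200).
SELECT employee_id, last_name, job_id, salary
FROM   employees
WHERE  salary < ALL (SELECT salary
                     FROM   employees
                     WHERE  job_id = 'IT_PROG')
AND    job_id <> 'IT_PROG';
```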
NOTE:
> ALL -- greater than all means more than the maximum
< ALL -- less than all means less than the minimum
NOTE:
The NOT operator can be used with any of these. Use the NOT operator with caution,
just as you would in other programming languages.
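The subquery
SELECT mgr.manager_id
FROM employees mgr
returns a NULL as one of its values. When it is used with NOT IN, the entire query returns no rows.
The problem is the NOT IN. NOT IN is equivalent to <> ALL, so every comparison must be true, and a
comparison with NULL is never true.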
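A sketch of the failing query, assuming it has the same shape as the IN version in these notes with NOT IN substituted. The intent is to list the employees who are not managers.

```sql
-- Returns no rows when any manager_id in the subquery is NULL:
-- employee_id <> NULL evaluates to NULL, so the <> ALL test can never
-- be TRUE for any employee.
SELECT emp.last_name
FROM   employees emp
WHERE  emp.employee_id NOT IN (SELECT mgr.manager_id
                               FROM   employees mgr);
```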
The IN version does not have this problem, because only one match is needed:
SELECT last_name
FROM employees emp
WHERE emp.employee_id IN
      (SELECT mgr.manager_id
       FROM employees mgr);
IN is equivalent to = ANY
NOTE:
To make the NOT IN query work, add a WHERE clause in the subquery: WHERE manager_id IS NOT NULL
SELECT last_name
FROM employees emp
WHERE emp.employee_id NOT IN
      (SELECT mgr.manager_id
       FROM employees mgr
       WHERE mgr.manager_id IS NOT NULL);
ASIDE:
Did we need the table aliases?
No, they were added for readability.
Prompt the user for an employee's last name. The query will return the last name and hire date of any
employee in the same department as the employee whose name was supplied. Do not include that employee in the output.
SELECT department_id
FROM employees
WHERE last_name = '&Name'
Enter ZLOTKEY and it will find nothing, because the stored name is in mixed case. Use the UPPER function on both sides:
SELECT department_id
FROM employees
WHERE UPPER(last_name) = UPPER('&Name')
UNDEFINE NAME;
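Putting the pieces together, one possible full solution is sketched below. It assumes SQL*Plus style substitution variables and the HR EMPLOYEES table; the use of && (so the user is prompted only once) and of IN instead of = are choices made for this sketch, not taken from the notes.

```sql
-- &&Name prompts once and keeps the value, so the later &Name reuses it.
-- IN is used in case several employees share the same last name and the
-- subquery returns more than one department.
SELECT last_name, hire_date
FROM   employees
WHERE  department_id IN (SELECT department_id
                         FROM   employees
                         WHERE  UPPER(last_name) = UPPER('&&Name'))
AND    UPPER(last_name) <> UPPER('&Name');   -- exclude the employee supplied

UNDEFINE Name
```

The UNDEFINE at the end clears the stored value so that the next run prompts again.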
A multiple-column subquery returns more than one column to the outer query and can be used in the
outer query's FROM, WHERE, or HAVING clause. For example, the query below shows the
employee or employees in each department whose current salary is the lowest (or minimum) salary in
the department.
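The query itself is not reproduced in these notes. A sketch of a multiple-column subquery that fits the description, assuming the HR EMPLOYEES table:

```sql
-- The inner query returns pairs (department_id, lowest salary).
-- The outer query keeps the employees whose own
-- (department_id, salary) pair matches one of those pairs.
SELECT department_id, last_name, salary
FROM   employees
WHERE  (department_id, salary) IN (SELECT department_id, MIN(salary)
                                   FROM   employees
                                   GROUP  BY department_id)
ORDER  BY department_id;
```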
8 rows selected
NOTE: In department 90 there are 2 people with the same minimum salary, so the full query should return
one more row than the subquery. Since both the subquery and the full query returned 8 rows, there must be
a row missing from the full query.
How would you fix this? Assuming the user wants to show the results where there is no department