The document explains subqueries in SQL, detailing their definition, types, and usage in various clauses such as SELECT, FROM, WHERE, and HAVING. It covers scalar, single-row, multi-row, correlated, and non-correlated subqueries, providing examples and explanations for each type. Additionally, it offers best practices for optimizing and debugging subqueries.
Understanding Nested Queries for Complex Data Retrieval
What is a Subquery?
• A subquery is a query nested inside another query.
• It returns a single value or a set of rows that the main query uses.
• It can be used in the SELECT, FROM, WHERE, and HAVING clauses.
• It is enclosed in parentheses ().
• Example:
    SELECT ename, salary
    FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee);
• Explanation:
  • The inner query computes the average salary.
  • The outer query retrieves the employees who earn more than that average.

Types of Subqueries
1. Scalar Subquery → Returns a single value.
2. Single-Row Subquery → Returns exactly one row (one or more columns).
3. Multi-Row Subquery → Returns multiple rows.
4. Correlated Subquery → Depends on the outer query.
5. Nested Subquery → Subquery inside another subquery.

Scalar Subquery
• Syntax:
    SELECT column1, column2
    FROM table_name
    WHERE column3 = (SELECT aggregate_function(column_name) FROM table_name);
• Example:
    SELECT ename, salary
    FROM employee
    WHERE salary = (SELECT MAX(salary) FROM employee);
• Explanation:
  • Inner query: (SELECT MAX(salary) FROM employee)
    • Retrieves the maximum salary from the employee table and returns it as a single value (e.g., 5000).
  • Outer query:
    • Selects the name (ename) and salary (salary) of each employee.
    • Uses the result of the inner query to filter the rows (salary = MAX(salary)).
• Output:
  • Displays the name and salary of the employee earning the highest salary in the table (or of every such employee, if several are tied).

Single-Row Subquery
• Works with comparison operators such as =, >, and <.
• Syntax:
    SELECT column1
    FROM table1
    WHERE column2 = (SELECT column3 FROM table2 WHERE condition);
• Example:
    SELECT ename
    FROM employee
    WHERE department_id = (
        SELECT department_id FROM department WHERE dname = 'IT'
    );
• Explanation:
  • The inner query retrieves the department_id of the IT department.
  • The outer query fetches the employees belonging to that department.
• Output:
  • Displays employees from the IT department.

Multi-Row Subquery
• Works with operators such as IN, ANY, and ALL.
• Syntax:
    SELECT column1
    FROM table1
    WHERE column2 IN (SELECT column3 FROM table2 WHERE condition);
• Example:
    SELECT ename
    FROM employee
    WHERE department_id IN (
        SELECT department_id FROM department WHERE dname LIKE 'Data%'
    );
• Explanation:
  • The inner query retrieves the department_id values of departments whose names start with 'Data'.
  • The outer query fetches the employees belonging to those departments.
• Output:
  • Displays employees from departments such as Data Science and Data Analysis.

Non-Correlated Subquery
• Runs first and passes its result to the outer query.
• Is independent of the outer query, so it can be run on its own.
• Syntax:
    SELECT column1
    FROM table1
    WHERE column2 OPERATOR (SELECT column3 FROM table2);
• Example:
    SELECT ename
    FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee);
• Explanation:
  • The subquery calculates the average salary across the whole employee table.
  • The outer query selects the employees whose salary is greater than that average.

Correlated Subquery
• Is logically evaluated once per row of the outer query.
• Uses values from the outer query, so it cannot be run on its own.
• Syntax:
    SELECT column1
    FROM table1 alias1
    WHERE column2 OPERATOR (
        SELECT aggregate_function(column3)
        FROM table2 alias2
        WHERE alias1.column4 = alias2.column5
    );
• Example:
    SELECT e1.ename, e1.salary
    FROM employee e1
    WHERE e1.salary > (
        SELECT AVG(e2.salary)
        FROM employee e2
        WHERE e1.department_id = e2.department_id
    );
• Explanation:
  • For each employee in the outer query, the subquery calculates the average salary of the employees in the same department.
  • The outer query compares each employee's salary with that department average.

Subqueries in Different Clauses
• Subqueries can appear in different clauses for different purposes:
1. SELECT Clause: Computes a value and includes it in the result set.
2. FROM Clause: Treats the subquery result as a temporary table.
3. WHERE Clause: Filters rows based on a condition.
4. HAVING Clause: Filters groups based on aggregate values.
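The multi-row, non-correlated, and correlated examples above can be checked against a small data set. The following is a minimal sketch that runs them in an in-memory SQLite database through Python's sqlite3 module. The table and column names come from the examples; the rows, the names() helper, and the ORDER BY clauses (added only to make the output deterministic) are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (department_id INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE employee (ename TEXT, salary REAL, department_id INTEGER);
INSERT INTO department VALUES
    (1, 'IT'), (2, 'Data Science'), (3, 'Data Analysis'), (4, 'HR');
INSERT INTO employee VALUES
    ('Asha', 5000, 1), ('Ben', 3000, 1),
    ('Chen', 4000, 2), ('Dina', 2000, 2),
    ('Eli', 3500, 3),
    ('Fay', 2500, 4), ('Gus', 1500, 4);
""")


def names(sql):
    """Illustrative helper: run a query and return the first column of each row."""
    return [row[0] for row in conn.execute(sql)]


# Multi-row subquery: the inner query returns several department_id values.
multi_row = names("""
    SELECT ename FROM employee
    WHERE department_id IN (
        SELECT department_id FROM department WHERE dname LIKE 'Data%'
    )
    ORDER BY ename
""")

# Non-correlated subquery: one company-wide average, computed independently.
non_correlated = names("""
    SELECT ename FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY ename
""")

# Correlated subquery: the average is recomputed for each outer row's department.
correlated = names("""
    SELECT e1.ename FROM employee e1
    WHERE e1.salary > (
        SELECT AVG(e2.salary) FROM employee e2
        WHERE e1.department_id = e2.department_id
    )
    ORDER BY e1.ename
""")

print(multi_row)       # ['Chen', 'Dina', 'Eli']
print(non_correlated)  # ['Asha', 'Chen', 'Eli']
print(correlated)      # ['Asha', 'Chen', 'Fay']
```

With this data, Eli earns more than the company-wide average but exactly the average of a one-person department, while Fay earns less than the company-wide average but more than the HR average, which is why the last two result lists differ.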
Subquery in SELECT Clause
• Syntax:
    SELECT column1,
           (SELECT aggregate_function(column2) FROM table2 WHERE condition) AS alias_name
    FROM table1;
• Example:
    SELECT ename,
           (SELECT MAX(salary) FROM employee) AS max_salary
    FROM employee;
• Explanation:
  • The subquery calculates the maximum salary in the employee table.
  • The outer query lists each employee name alongside that maximum salary.

Subquery in FROM Clause
• Syntax:
    SELECT column1, column2
    FROM (
        SELECT column3, aggregate_function(column4) AS alias_name
        FROM table2
        GROUP BY column3
    ) AS temp_table;
• Example:
    SELECT department_id, total_salary
    FROM (
        SELECT department_id, SUM(salary) AS total_salary
        FROM employee
        GROUP BY department_id
    ) AS dept_summary;
• Explanation:
  • The inner query calculates the total salary for each department.
  • The outer query selects the department ID and total salary from the temporary subquery result. Only columns returned by the inner query (here department_id and total_salary) are available to the outer query.

Subquery in WHERE Clause
• Syntax:
    SELECT column1
    FROM table1
    WHERE column2 OPERATOR (SELECT column3 FROM table2 WHERE condition);
• Example:
    SELECT ename
    FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee);
• Explanation:
  • The subquery calculates the average salary.
  • The outer query keeps the employees whose salary is greater than that average.

Subquery in HAVING Clause
• Syntax:
    SELECT column1, aggregate_function(column2)
    FROM table1
    GROUP BY column1
    HAVING aggregate_function(column2) OPERATOR (
        SELECT aggregate_function(column3) FROM table2 WHERE condition
    );
• Example:
    SELECT department_id, SUM(salary)
    FROM employee
    GROUP BY department_id
    HAVING SUM(salary) > (SELECT AVG(salary) * 10 FROM employee);
• Explanation:
  • The subquery calculates ten times the average salary.
  • The outer query keeps the departments whose total salary exceeds that value.

Best Practices:
1. Choose the Right Type of Subquery
• Use Scalar Subqueries when retrieving a single value.
• Use Multi-Row Subqueries with IN, ANY, and ALL when handling multiple results.
• Use Correlated Subqueries only when necessary (they are logically evaluated once for each row of the outer query).
2. Optimize Correlated Subqueries
• When a correlated subquery is needed, limit the number of rows it has to process, for example by filtering the outer query first.
3. Use HAVING for Aggregate Subqueries
• When filtering grouped data on an aggregate value, use HAVING instead of WHERE; WHERE is applied before grouping and cannot reference aggregates.
4. Debug Step-by-Step
• Break down complex subqueries into smaller queries and test each part on its own.
• Use EXPLAIN to analyze how the query runs. The output format depends on the database system.
    EXPLAIN SELECT ename FROM employee WHERE salary > (SELECT AVG(salary) FROM employee);
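The step-by-step debugging advice can be sketched as follows, again using an in-memory SQLite database through Python's sqlite3 module. The sample rows are hypothetical, and the EXPLAIN QUERY PLAN form is an assumption specific to SQLite (its plain EXPLAIN prints low-level bytecode); other systems use their own EXPLAIN syntax and output.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ename TEXT, salary REAL, department_id INTEGER);
INSERT INTO employee VALUES
    ('Asha', 5000, 1), ('Ben', 3000, 1),
    ('Chen', 4000, 2), ('Dina', 2000, 2);
""")

# Step 1: run the inner query on its own and inspect its result.
inner_avg = conn.execute("SELECT AVG(salary) FROM employee").fetchone()[0]

# Step 2: run the outer query with that value supplied as a parameter.
outer_only = [row[0] for row in conn.execute(
    "SELECT ename FROM employee WHERE salary > ? ORDER BY ename",
    (inner_avg,),
)]

# Step 3: run the combined query and check that it agrees with step 2.
full_sql = """
    SELECT ename FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY ename
"""
combined = [row[0] for row in conn.execute(full_sql)]

# Step 4: ask the engine how it plans to run the combined query.
# EXPLAIN QUERY PLAN is SQLite-specific; the last column holds the description.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + full_sql)]

print(inner_avg)   # 3500.0
print(outer_only)  # ['Asha', 'Chen']
print(combined)    # ['Asha', 'Chen']
for step in plan:
    print(step)    # plan text varies between SQLite versions
```

If steps 2 and 3 disagree, the problem lies in how the subquery is embedded rather than in either query alone, which narrows down where to look.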