0% found this document useful (0 votes)
2 views

sql correlated subqueries - advanced sql _ bipp analytics

The document explains SQL correlated subqueries, which are executed for each record returned by an outer query and are essential for certain data questions, particularly negative data queries. It contrasts correlated subqueries with simple subqueries, emphasizing that correlated subqueries should be used sparingly due to performance concerns. The document provides examples of both negative and positive data queries, highlighting best practices for using JOINs instead of correlated subqueries when possible.

Uploaded by

Yanet Cesaire
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

sql correlated subqueries - advanced sql _ bipp analytics

The document explains SQL correlated subqueries, which are executed for each record returned by an outer query and are essential for certain data questions, particularly negative data queries. It contrasts correlated subqueries with simple subqueries, emphasizing that correlated subqueries should be used sparingly due to performance concerns. The document provides examples of both negative and positive data queries, highlighting best practices for using JOINs instead of correlated subqueries when possible.

Uploaded by

Yanet Cesaire
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

Platform Features Pricing Resources Learn Company Connect Sign in Try for free

Search SQL Tutorial

SQL Tutorial > Advanced SQL > SQL Correlated Subqueries

SQL Correlated Subqueries Increase the Power of


SQL
A SQL correlated subquery is a query which is executed one time for each record IN THIS PAGE
returned by the outer query. It is called correlated as it is a correlation between the
number of times the subquery is executed with the number of records returned by the Using SQL
Correlated
outer query (not the subquery). Subqueries
Correlated vs
Correlated subqueries are a different type of subquery. In some situations, you must Simple Subqueries
use a correlated subquery to obtain the correct reply or result data set. This lesson What About
Positive Data
covers how to identify the situations that require a correlated subquery. There are Questions?
examples of correlated subqueries, and a discussion about overusing correlated A Complex
subqueries. Example Using a
Correlated
Subquery

Using SQL Correlated Subqueries Closing Words

There are some data questions where the SQL to solve the question must include a
correlated subquery. This is particularly true in queries where you are looking for what
might be called negatives or negative data questions.

These examples use two tables: employee and payment_history. The


payment_history table has a column called payment_type that indicates if a
payment is part of a regular salary, a bonus, or an award. Here is some sample data for
the tables:

Table employee

employee_id first_name last_name

10100 Marcos Bisset

10101 Kate Perez


10102 Carlos Casco

Table payment_history

employee_id payment_type amount_paid payment_date


10100 monthly pay 12000.00 2018-02-02

10101 monthly pay 2800.00 2018-02-02

10102 monthly pay 1900.00 2018-03-02


10101 bonus 1500.00 2018-03-08

10102 adjustment 124.70 2018-03-10

Here is an SQL query to answer a negative data question: List the employees who
have never received a bonus.

Copy

SELECT last_name, first_name


FROM employee e1
WHERE NOT EXISTS (SELECT ph.last_name
FROM payment_history ph
WHERE ph.employee_id = e1.employee_id
AND ph.payment_type = ‘bonus’
)

The result:

first_name last_name
first_name last_name
Marcos Bisset

Carlos Casco

Correlated vs Simple Subqueries


The main difference between correlated and simple subqueries is that correlated
subqueries reference columns from the outer table. In the example,
ph.employee_id = e1.employee_id is a reference to the outer subquery table. To
identify a correlated query, just look for these kinds of references. If you find at least
one, you have a correlated subquery!

The negative part of the data question is often solved in the correlated subquery by
using a NOT EXISTS operator in the WHERE clause. EXISTS is an operator always
followed by a subquery. If the subquery returns at least one record, then the EXISTS
evaluates to TRUE. If the subquery returns empty, then the EXISTS evaluates to
FALSE. Note we use NOT EXISTS, which is the opposite to EXISTS.

What About Positive Data Questions?


Should you use a correlated subquery for a positive data question? No. You can if you
want to, but a JOIN condition or a relationship between two tables to answer data
questions is a better approach for positive data questions.

Here is an SQL query to answer a positive data question: List the employees who have
received a bonus.

Copy

SELECT first_name, last_name


FROM employee e1, payment_history ph
WHERE ph.employee_id = e1.employee_id
AND ph.payment_type = ‘bonus’

The query uses an implicit JOIN by putting both tables in the FROM clause. The implicit
JOIN is equivalent to the query:

Copy

SELECT first_name, last_name


FROM employee e1
INNER JOIN payment_history ph ON ph.employee_id = e1.emp
loyee_id
WHERE ph.payment_type = ‘bonus’

The result is:

first_name last_name
Kate Perez

It is a best practice to solve data question with JOIN instead of a correlated subquery
whenever possible. You should only use correlated subqueries for negative data
questions or other data questions where a correlated subquery is the only way to
answer it.

Performance is the biggest reason to avoid correlated subqueries. They tend to be


really slow. Avoid them when you can.

A Complex Example Using a Correlated


Subquery
In this example, obtain the employee names who made a higher salary in March 2018
than their average monthly salary for all previous months.

Copy
SELECT first_name, last_name
FROM employee e1, payment_history ph
WHERE e1.employee_id = ph.employee_id
AND amount_paid > = (
SELECT avg(amount_pa
id)
FROM payment_histo
ty ph2
WHERE ph2.employee_
id = e1.employee_id
AND ph2.payme
nt_date < ‘2018-03-01’
)
AND month(ph.payment_date) = 3
AND year(ph.payment_date) = 2018
AND ph.payment_type = ‘salary’

You can identify the reference to the outer subquery table marked in red, so you know
this is a correlated subquery. The subquery executes once per record instead of just a
single time like a simple query.

Closing Words
In this lesson you learned how to identify and use correlated subqueries, a special kind
of subquery, which is used under specific circumstances. The next lesson covers SQL
Window Functions: very powerful functions that add new possibilities to the SQL
language. Keep going, learn SQL and increase your skills!

SQL String Functions SQL Window Functions

Sign up for a Connect to leading SQL


databases
Build visual data models
without learning a new
Create two dashboards
and ten reports for free
forever free language

account!
Sign up for free

Platform Features Resources Learn Company Connect

Business Intelligence bipp Data Modeling Layer Blog Documentation About Request Demo

Embedded Analytics Visual SQL Data Explorer Reports bipp Tutorial Meet The Team Support

Professional Services Data Visualization Release Notes Why bipp? Careers Contact Us

Security Git-based Version Control Community SQL Tutorial News

Pricing In-Database Migrate from Chartio Definitions Case studies

SQL Editor Compare Disclosure


Scheduled Delivery and Privacy
Alerts
Terms
Real-Time BI

© 2022 bipp Inc.

You might also like