SQL Joins Tutorial: Cross Join, Full Outer Join, Inner Join, Left Join, and Right Join
John Mosesman
SQL joins allow our relational database management
systems to be, well, relational.
Joins let us reassemble our separated database tables into the
relationships that power our applications.
In this article, we'll look at each of the different join types in SQL
and how to use them.
What is a join?
Setting up your database
CROSS JOIN
Setting up our example data (directors and movies)
FULL OUTER JOIN
INNER JOIN
LEFT JOIN / RIGHT JOIN
Filtering using LEFT JOIN
Multiple joins
Joins with extra conditions
The reality about writing queries with joins
(Spoiler alert: we'll cover five different types—but you really only
need to know two of them!)
What is a join?
A join is an operation that combines two rows together into one
row.
These rows are usually from two different tables—but they don't
have to be.
Before we look at how to write the join itself, let's look at what the
result of a join would look like.
Let's take for example a system that stores information about users
and their addresses.
The rows from the table that stores user information might look like
this:
We'll look at how to write these joins soon, but if we joined our
user information to our address information we could get a result
like this:
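As a rough sketch of that idea, assume two hypothetical tables, users and addresses, where each address row points at a user through a user_id column. None of these table names, column names, or values come from a real system; they are only here to show the shape of a joined row.

```sql
-- Hypothetical schema and data: every name and value here is an assumption.
CREATE TABLE users(
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL
);
CREATE TABLE addresses(
  id INTEGER PRIMARY KEY,
  user_id INTEGER REFERENCES users(id),
  city TEXT NOT NULL
);
INSERT INTO users(id, name) VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO addresses(id, user_id, city) VALUES (1, 1, 'Austin'), (2, 2, 'Boston');

-- Each result row combines one users row with its matching addresses row.
SELECT users.id, users.name, addresses.city
FROM users
INNER JOIN addresses
  ON addresses.user_id = users.id;
```

The result has one row per user, carrying columns from both tables side by side.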
Setting up your database
For these examples we'll be using PostgreSQL, but the queries and
concepts shown here will easily translate to any other modern
database system (like MySQL, SQL Server, etc.).
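If you want to follow along, one way to get an empty database to experiment in is to create it from a psql prompt. The name fcc matches the session shown in this article, but any name would work.

```sql
-- Create an empty database to run the examples in.
CREATE DATABASE fcc;
```

From there, the psql meta-command \c fcc switches the session over to the new database.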
john=# \c fcc
You are now connected to database "fcc" as user "john".
fcc=#
Note: I've cleaned up the psql output in these examples to make it
easier to read, so don't worry if the output shown here isn't exactly
what you've seen in your terminal.
I encourage you to follow along with these examples and run these
queries for yourself. You will learn and remember far more by
working through them than by just reading them.
CROSS JOIN
The simplest kind of join we can do is a CROSS JOIN or "Cartesian
product."
This join takes each row from one table and joins it with each row
of the other table.
First let's create two very simple tables and insert some data into
them:
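A minimal sketch of that setup follows. The table names letters and numbers come from this article, but the column names and the inserted values are assumptions.

```sql
-- Column names and values are assumptions; only the table names come from the article.
CREATE TABLE letters(
  letter TEXT
);
INSERT INTO letters(letter) VALUES ('A'), ('B'), ('C');

CREATE TABLE numbers(
  number INTEGER
);
INSERT INTO numbers(number) VALUES (1), (2), (3);

-- Pair every row of letters with every row of numbers (3 x 3 = 9 rows).
SELECT *
FROM letters
CROSS JOIN numbers;
```

Note that a CROSS JOIN takes no ON condition: every combination of rows is returned.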
This is the simplest type of join we can do—but even in this simple
example we can see the join at work: the two separate rows (one
from letters and one from numbers) have been joined together to
form one row.
While this type of join is often discussed as a mere academic
example, it does have at least one good use case: covering date
ranges.
    day
------------
 2020-08-19
 2020-08-20
 2020-08-21
 2020-08-22
 2020-08-23
 2020-08-24
(6 rows)
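In PostgreSQL, a list of days like this one can be produced with the built-in generate_series function. Here is a minimal sketch, assuming the same six-day range as the output above:

```sql
-- generate_series(start, stop, step) returns one row per step,
-- including both the start and the stop value.
SELECT generate_series(
  '2020-08-19'::timestamp,
  '2020-08-24'::timestamp,
  '1 day'::interval
)::date AS day;
```

The cast to date drops the time portion so each row is just a calendar day.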
Going back to our tasks-per-day example, let's create a simple table
to hold the tasks we want to complete and insert a few tasks:
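Here is one way this could look. The output in this article shows 24 rows over six days, which implies four tasks, but only "Brush teeth" and "Eat breakfast" are visible, so the other two task names below are placeholders. The column name and the dates alias are assumptions as well.

```sql
-- The last two task names are placeholders (assumptions).
CREATE TABLE tasks(
  name TEXT
);
INSERT INTO tasks(name) VALUES
  ('Brush teeth'),
  ('Eat breakfast'),
  ('Shower'),
  ('Get dressed');

-- Cross join every task with every day in the range.
SELECT tasks.name, dates.day
FROM tasks
CROSS JOIN (
  SELECT generate_series(
    '2020-08-19'::timestamp,
    '2020-08-24'::timestamp,
    '1 day'::interval
  )::date AS day
) AS dates
ORDER BY tasks.name, dates.day;
```

The ORDER BY is there only to make the row order predictable; the cross join itself does not guarantee any order.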
From this query we return the task name and the day, and the result
set looks like this:
     name      |    day
---------------+------------
 Brush teeth   | 2020-08-19
 Brush teeth   | 2020-08-20
 Brush teeth   | 2020-08-21
 Brush teeth   | 2020-08-22
 Brush teeth   | 2020-08-23
 Brush teeth   | 2020-08-24
 Eat breakfast | 2020-08-19
 Eat breakfast | 2020-08-20
 Eat breakfast | 2020-08-21
 Eat breakfast | 2020-08-22
 ...
(24 rows)
Like we expected, we get a row for each task for every day in our
date range.
The CROSS JOIN is the simplest join we can do, but to look at the
next few types we'll need a more realistic table setup.
Creating directors and movies
To illustrate the following join types, we'll use the example
of movies and movie directors.
In this situation, a movie has one director, but a movie
isn't required to have one. Imagine a new movie that has been
announced before its director is confirmed.
Our directors table will store the name of each director, and
the movies table will store the name of the movie as well as a
reference to the director of the movie (if it has one).
Let's create those two tables and insert some data into them:
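A sketch of that setup is below. The five directors and the three movies with a director match the query results shown later in this article. The column types and the two director-less movies ("Movie 4" and "Movie 5") are assumptions, added so that there is something to see in the outer joins.

```sql
CREATE TABLE directors(
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL
);
INSERT INTO directors(id, name) VALUES
  (1, 'John Smith'),
  (2, 'Jane Doe'),
  (3, 'Xavier Wills'),
  (4, 'Bev Scott'),
  (5, 'Bree Jensen');

CREATE TABLE movies(
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  -- Nullable on purpose: NULL means the movie has no director yet.
  director_id INTEGER REFERENCES directors(id)
);
INSERT INTO movies(id, name, director_id) VALUES
  (1, 'Movie 1', 1),
  (2, 'Movie 2', 1),
  (3, 'Movie 3', 2),
  (4, 'Movie 4', NULL),  -- assumption: a movie without a director
  (5, 'Movie 5', NULL);  -- assumption: a movie without a director
```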
FULL OUTER JOIN
SELECT *
FROM movies
FULL OUTER JOIN directors
ON directors.id = movies.director_id;
Notice the join condition we specified that matches the movie to its
director: ON directors.id = movies.director_id.
Our result set looks like an odd Cartesian product of sorts:
INNER JOIN
SELECT *
FROM movies
INNER JOIN directors
ON directors.id = movies.director_id;
Our result shows the three movies that have a director:
SELECT *
FROM directors
INNER JOIN movies
ON movies.director_id = directors.id;
 id |    name    | id |  name   | director_id
----+------------+----+---------+-------------
  1 | John Smith |  1 | Movie 1 |           1
  1 | John Smith |  2 | Movie 2 |           1
  2 | Jane Doe   |  3 | Movie 3 |           2
(3 rows)
Since we listed the directors table first in this query and we
selected all columns (SELECT *), we see the directors column data
first and then the columns from movies—but the resulting data is
the same.
This is a useful property of inner joins, but it's not true for all join
types—like our next type.
LEFT JOIN / RIGHT JOIN
SELECT *
FROM movies
LEFT JOIN directors
ON directors.id = movies.director_id;
The query follows the same pattern as before; we've just specified
the join as a LEFT JOIN.
In this example, the movies table is the "left" table.
If we write the query on one line it makes this a little easier to see:
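Here is the same LEFT JOIN query from above collapsed onto a single line:

```sql
-- movies is written to the left of the LEFT JOIN keyword, so it is the "left" table.
SELECT * FROM movies LEFT JOIN directors ON directors.id = movies.director_id;
```

Read left to right, movies comes first and directors comes second, which is exactly what "left" and "right" refer to.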
Filtering using LEFT JOIN
The second use case is to return rows from the first table where the
data from the second table isn't present.
The scenario would look like this: find directors who don't belong
to a movie.
To do this we'll start with a LEFT JOIN and our directors table will
be the primary or "left" table:
SELECT *
FROM directors
LEFT JOIN movies
ON movies.director_id = directors.id;
For a director that doesn't belong to a movie, the columns from
the movies table are NULL:
 id |     name     |  id  |  name   | director_id
----+--------------+------+---------+-------------
  1 | John Smith   |    1 | Movie 1 |           1
  1 | John Smith   |    2 | Movie 2 |           1
  2 | Jane Doe     |    3 | Movie 3 |           2
  5 | Bree Jensen  | NULL | NULL    |        NULL
  4 | Bev Scott    | NULL | NULL    |        NULL
  3 | Xavier Wills | NULL | NULL    |        NULL
(6 rows)
In our example, the directors with IDs 3, 4, and 5 don't belong to a movie.
To filter our result set just to these rows, we can add a WHERE clause
to only return rows where the movie data is NULL:
SELECT *
FROM directors
LEFT JOIN movies
ON movies.director_id = directors.id
WHERE movies.id IS NULL;
 id |     name     |  id  | name | director_id
----+--------------+------+------+-------------
  5 | Bree Jensen  | NULL | NULL |        NULL
  4 | Bev Scott    | NULL | NULL |        NULL
  3 | Xavier Wills | NULL | NULL |        NULL
(3 rows)
And there are our three movie-less directors!
In our result set, we'll notice that we've further narrowed down the
rows that are returned:
But, in reality, the database is not joining three tables together at the
same time. Instead, it will likely join the first two tables together
into one intermediary result, and then join that intermediary result
set to the third table.
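To make that concrete, here is a sketch of a three-table join. It assumes the movies and directors tables used throughout this article, plus a hypothetical tickets table; the tickets table, its columns, and its rows are all assumptions for illustration.

```sql
-- Hypothetical third table; its name, columns, and data are assumptions.
CREATE TABLE tickets(
  id INTEGER PRIMARY KEY,
  movie_id INTEGER REFERENCES movies(id)
);
INSERT INTO tickets(id, movie_id) VALUES (1, 1), (2, 1), (3, 3);

-- Conceptually, movies joined to directors forms an intermediate result,
-- and that intermediate result is then joined to tickets.
SELECT movies.name AS movie, directors.name AS director, tickets.id AS ticket_id
FROM movies
INNER JOIN directors
  ON directors.id = movies.director_id
INNER JOIN tickets
  ON tickets.movie_id = movies.id;
```

The order written in the query is a way to think about it; the database's query planner is free to pick a different join order if it produces the same result.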
(If you're unfamiliar with all of the ways you can filter a SQL
query, check out the previously mentioned article here.)
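One way to narrow the rows a join returns is to add an extra condition to the ON clause with AND. This is a minimal sketch, assuming the movies and directors tables used throughout this article; the particular filter on the director's name is only an illustration.

```sql
-- Match a movie to its director, but only when the director is not John Smith.
SELECT *
FROM movies
INNER JOIN directors
  ON directors.id = movies.director_id
  AND directors.name <> 'John Smith';
```

With an INNER JOIN, the same filter could also be written in a WHERE clause. With a LEFT JOIN the two placements behave differently, because a condition in ON only decides which rows match, while WHERE removes rows from the final result.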
The reality about writing queries with joins
In reality, I find myself only using joins in three different ways:
INNER JOIN
The first use case is records where the relationship between two
tables does exist. This is fulfilled by the INNER JOIN.
These are situations like finding "movies that have
directors" or "users with posts".
LEFT JOIN
The second use case is records from one table—and if the
relationship exists—records from a second table. This is fulfilled by
the LEFT JOIN.
These are situations like "movies with directors if they have
one" or "users with posts if they have some."
LEFT JOIN exclusion
The third most common use case is our second use case for a LEFT
JOIN: finding records in one table that don't have a relationship in
the second table.
These are situations like "movies without directors" or "users
without posts."
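The three patterns can be sketched side by side, assuming the movies and directors tables used throughout this article:

```sql
-- 1. INNER JOIN: movies that have a director.
SELECT movies.name, directors.name AS director
FROM movies
INNER JOIN directors
  ON directors.id = movies.director_id;

-- 2. LEFT JOIN: every movie, with its director if it has one.
SELECT movies.name, directors.name AS director
FROM movies
LEFT JOIN directors
  ON directors.id = movies.director_id;

-- 3. LEFT JOIN exclusion: movies without a director.
SELECT movies.name
FROM movies
LEFT JOIN directors
  ON directors.id = movies.director_id
WHERE directors.id IS NULL;
```

The only difference between the second and third queries is the WHERE clause that keeps just the rows where no director was matched.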
Two very useful join types
I don't think I've ever used a FULL OUTER JOIN or a RIGHT JOIN in a
production application. The use case just doesn't come up often
enough or the query can be written in a clearer way (in the case
of RIGHT JOIN).
I have occasionally used a CROSS JOIN for things like spreading
records across a date range (as we saw at the beginning), but
that scenario also doesn't come up too often.
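As an illustration of the RIGHT JOIN point, any RIGHT JOIN can be rewritten as a LEFT JOIN by swapping the order of the tables. A small sketch, assuming the movies and directors tables from this article:

```sql
-- Both queries return every director, plus movie data where it exists.
SELECT directors.name AS director, movies.name AS movie
FROM movies
RIGHT JOIN directors
  ON directors.id = movies.director_id;

SELECT directors.name AS director, movies.name AS movie
FROM directors
LEFT JOIN movies
  ON movies.director_id = directors.id;
```

Whether the second form is clearer is a matter of taste, but it keeps the "primary" table first, which matches how the other queries in this article read.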
So, good news! There are really only two types of joins you need to
understand for 99.9% of the use cases you'll come across: INNER
JOIN and LEFT JOIN!
If you liked this post, you can follow me on twitter where I talk
about database things and all other topics related to development.
Thanks for reading!
John
P.S. an extra tip for reading to the end: most database systems will
let you just write JOIN in the place of INNER JOIN—it'll save you a
little extra typing. :)
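For example, in PostgreSQL the following query behaves the same as the INNER JOIN version shown earlier (assuming the same movies and directors tables):

```sql
-- A bare JOIN is treated as an INNER JOIN.
SELECT *
FROM movies
JOIN directors
  ON directors.id = movies.director_id;
```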