SQL Recap

Uploaded by

Alice Beriozza
Select query

26 June 2023
11:46

To retrieve data from a SQL database, we need to write SELECT statements.

Select query for specific columns


SELECT column, another_column, …
FROM mytable;

Select query for all columns


SELECT *
FROM mytable;

Exercise 1 — Tasks
1. Find the title of each film
SELECT title FROM Movies;

2. Find the director of each film


SELECT director FROM Movies;

3. Find the title and director of each film


SELECT title, director FROM Movies;

4. Find the title and year of each film


SELECT title, year FROM Movies;

5. Find all the information about each film


SELECT * FROM Movies;

Where clause
25 July 2023
12:14

The WHERE clause is applied to each row of data by checking specific
column values to determine whether it should be included in the
results or not.
Select query with constraints
SELECT column, another_column, …
FROM mytable
WHERE condition
AND/OR another_condition
AND/OR…;

More complex clauses can be constructed by joining
numerous AND or OR logical keywords (i.e. num_wheels >= 4 AND
doors <= 2). Below are some useful operators that you can use
for numerical data (i.e. integer or floating point):
Operator | Condition | SQL Example
=, !=, <, <=, >, >= | Standard numerical operators | col_name != 4
BETWEEN … AND … | Number is within range of two values (inclusive) | col_name BETWEEN 1.5 AND 10.5
NOT BETWEEN … AND … | Number is not within range of two values (inclusive) | col_name NOT BETWEEN 1 AND 10
IN (…) | Number exists in a list | col_name IN (2, 4, 6)
NOT IN (…) | Number does not exist in a list | col_name NOT IN (1, 3, 5)
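
As a rough illustration of the numerical operators above, the sketch below runs two of them against a tiny in-memory SQLite database through Python's built-in sqlite3 module. The table contents are made-up sample rows (only the column names follow the exercises), and other databases may differ in details.

```python
import sqlite3

# Hypothetical sample data: titles and years are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [(1, "Film A", 1995), (2, "Film B", 2000), (3, "Film C", 2005),
     (4, "Film D", 2010), (5, "Film E", 2015)],
)

# BETWEEN is inclusive on both ends, so 2000 and 2010 both match.
# ORDER BY is used here only to make the output order predictable.
between_rows = conn.execute(
    "SELECT title FROM movies WHERE year BETWEEN 2000 AND 2010 ORDER BY id"
).fetchall()

# NOT IN keeps only the rows whose id is absent from the list.
not_in_rows = conn.execute(
    "SELECT title FROM movies WHERE id NOT IN (1, 3, 5) ORDER BY id"
).fetchall()

print(between_rows)
print(not_in_rows)
```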

Exercise 2 — Tasks
a. Find the movie with a row id of 6
SELECT * FROM movies
WHERE id = 6;

b. Find the movies released in the years between 2000 and 2010
SELECT * FROM movies
WHERE year BETWEEN 2000 and 2010;

c. Find the movies not released in the years between 2000 and 2010
SELECT * FROM movies
WHERE year NOT BETWEEN 2000 and 2010;

d. Find the first 5 Pixar movies and their release year
SELECT title, year FROM movies
WHERE year <= 2003;

More constraints
When writing WHERE clauses with columns containing text data, SQL
supports a number of useful operators to do things like case-
insensitive string comparison and wildcard pattern matching.
Operator | Condition | Example
= | Case sensitive exact string comparison (notice the single equals) | col_name = "abc"
!= or <> | Case sensitive exact string inequality comparison | col_name != "abcd"
LIKE | Case insensitive exact string comparison | col_name LIKE "ABC"
NOT LIKE | Case insensitive exact string inequality comparison | col_name NOT LIKE "ABCD"
% | Used anywhere in a string to match a sequence of zero or more characters (only with LIKE or NOT LIKE) | col_name LIKE "%AT%" (matches "AT", "ATTIC", "CAT" or even "BATS")
_ | Used anywhere in a string to match a single character (only with LIKE or NOT LIKE) | col_name LIKE "AN_" (matches "AND", but not "AN")
IN (…) | String exists in a list | col_name IN ("A", "B", "C")
NOT IN (…) | String does not exist in a list | col_name NOT IN ("D", "E", "F")

All strings must be quoted so that the query parser can distinguish words in
the string from SQL keywords.
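
The difference between = and LIKE is easy to check on a few rows. The sketch below uses Python's built-in sqlite3 module with made-up titles; it relies on SQLite's default behaviour (case-sensitive =, case-insensitive LIKE for ASCII text), which other databases or collations may not share. String literals use single quotes, which is the standard SQL form.

```python
import sqlite3

# Hypothetical sample rows chosen to exercise the string operators.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [(1, "Toy Story"), (2, "toy story 2"), (3, "WALL-E"), (4, "Cars")],
)

def titles(where_clause):
    """Return the titles matching a WHERE clause, ordered by id."""
    sql = f"SELECT title FROM movies WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

equals_upper = titles("title = 'TOY STORY'")      # exact match, case matters
like_upper = titles("title LIKE 'TOY STORY'")     # exact match, case ignored
like_percent = titles("title LIKE '%toy story%'") # % matches zero or more characters
like_underscore = titles("title LIKE 'Car_'")     # _ matches exactly one character

print(equals_upper, like_upper, like_percent, like_underscore)
```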

Exercise 3 — Tasks
a. Find all the Toy Story movies
SELECT * FROM movies
WHERE title like "%toy story%";

b. Find all the movies directed by John Lasseter


SELECT * FROM movies
WHERE director like "John Lasseter";

c. Find all the movies (and director) not directed by John Lasseter
SELECT title, director FROM movies
WHERE director NOT like "John Lasseter";

d. Find all the WALL-* movies


SELECT title FROM movies
WHERE title like "%Wall-%";
Filtering and Sorting
25 July 2023
12:38

By using the DISTINCT keyword, you can discard rows that have a
duplicate column value.
Select query with unique results
SELECT DISTINCT column, another_column, …
FROM mytable
WHERE condition(s);

You can sort your results by a given column in ascending or descending
order (alpha-numerically) using the ORDER BY clause.
Select query with ordered results
SELECT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC;

The LIMIT clause reduces the number of rows to return, and the
optional OFFSET specifies where to begin counting the number of
rows from.
Select query with limited rows
SELECT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC
LIMIT num_limit OFFSET num_offset;

LIMIT and OFFSET are applied relative to the other parts of a query; they are generally done
last, after the other clauses have been applied.
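
To see DISTINCT, ORDER BY and LIMIT/OFFSET working together, here is a small sketch using Python's built-in sqlite3 module. The titles and directors are invented placeholders, not the exercise data.

```python
import sqlite3

# Hypothetical rows; two directors appear twice on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, director TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Delta", "Ann"), ("Alpha", "Bob"), ("Echo", "Ann"),
     ("Charlie", "Cy"), ("Bravo", "Bob")],
)

# DISTINCT collapses the repeated director names into one row each.
directors = [row[0] for row in conn.execute(
    "SELECT DISTINCT director FROM movies ORDER BY director ASC"
)]

# Sorting happens first; OFFSET 2 then skips two rows and LIMIT 2 keeps the next two.
second_page = [row[0] for row in conn.execute(
    "SELECT title FROM movies ORDER BY title ASC LIMIT 2 OFFSET 2"
)]

print(directors)
print(second_page)
```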

Exercise 4 — Tasks
a. List all directors of Pixar movies (alphabetically), without duplicates
SELECT DISTINCT director FROM movies
ORDER BY director ASC;

b. List the last four Pixar movies released (ordered from most recent to least)
SELECT title, year FROM movies
ORDER BY year DESC
LIMIT 4;

c. List the first five Pixar movies sorted alphabetically


SELECT title FROM movies
ORDER BY title ASC
LIMIT 5;

d. List the next five Pixar movies sorted alphabetically


SELECT title FROM movies
ORDER BY title ASC
LIMIT 5 OFFSET 5;

Review
25 July 2023
12:59

Review 1 — Tasks
1. List all the Canadian cities and their populations
SELECT city, population FROM north_american_cities
WHERE country = "Canada";

2. Order all the cities in the United States by their latitude from north to south
SELECT city, latitude FROM north_american_cities
WHERE country = "United States"
ORDER BY latitude DESC;

3. List all the cities west of Chicago, ordered from west to east (from the
table, Chicago's longitude is -87.629798)
SELECT city, longitude FROM North_american_cities
WHERE longitude < -87.629798
ORDER BY longitude ASC;

4. List the two largest cities in Mexico (by population)


SELECT city FROM North_american_cities
WHERE country = "Mexico"
ORDER BY population DESC
LIMIT 2;

5. List the third and fourth largest cities (by population) in the United States and
their population
SELECT city, population FROM North_american_cities
WHERE country = "United States"
ORDER BY population DESC
LIMIT 2 OFFSET 2;

INNER JOIN
25 July 2023
13:21

Entity data in the real world is often broken down into pieces and stored across multiple orthogonal
tables using a process known as normalization.

Database normalization is useful because it minimizes duplicate data in any single table, and allows
data in different tables to grow independently of each other.

Multi-table queries with JOINs

Tables that share information about a single entity need to have
a primary key that identifies that entity uniquely across the
database.
Using the JOIN clause in a query, we can combine row data across two
separate tables using this unique key.

Select query with INNER JOIN on multiple tables


SELECT column, another_table_column, …
FROM mytable
INNER JOIN another_table
ON mytable.id = another_table.id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

The INNER JOIN is a process that matches rows from the first table and the second table which
have the same key (as defined by the ON constraint) to create a result row with the combined
columns from both tables. After the tables are joined, the other clauses are then applied.
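
A minimal sketch of that matching process, using Python's built-in sqlite3 module. The table and column names follow the exercises, but the rows and sales figures are invented. Note that the movie without a box office row disappears from the joined result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE boxoffice (movie_id INTEGER, domestic_sales INTEGER, "
    "international_sales INTEGER)"
)
# Hypothetical data: "Film C" deliberately has no boxoffice row.
conn.executemany("INSERT INTO movies VALUES (?, ?)",
                 [(1, "Film A"), (2, "Film B"), (3, "Film C")])
conn.executemany("INSERT INTO boxoffice VALUES (?, ?, ?)",
                 [(1, 100, 200), (2, 300, 150)])

# Rows are paired wherever the ON condition holds; unmatched rows are dropped.
joined = conn.execute("""
    SELECT m.title, b.domestic_sales, b.international_sales
    FROM movies AS m
    INNER JOIN boxoffice AS b ON b.movie_id = m.id
    ORDER BY m.id
""").fetchall()

# WHERE is applied after the join, so it can use columns from either table.
better_abroad = conn.execute("""
    SELECT m.title
    FROM movies AS m
    INNER JOIN boxoffice AS b ON b.movie_id = m.id
    WHERE b.international_sales > b.domestic_sales
""").fetchall()

print(joined)
print(better_abroad)
```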

In the exercise database, the Movie_id column in the Boxoffice table corresponds
to the Id column in the Movies table.

Exercise 6 — Tasks
a. Find the domestic and international sales for each movie
SELECT Title, International_sales, Domestic_sales FROM Movies M
INNER JOIN Boxoffice B ON B.Movie_id = M.id;
b. Show the sales numbers for each movie that did better internationally rather than
domestically
SELECT Title, International_sales, Domestic_sales FROM Movies M
INNER JOIN Boxoffice B ON B.Movie_id = M.id
WHERE International_sales > Domestic_sales;

c. List all the movies by their ratings in descending order


SELECT Title, Rating FROM Movies M
INNER JOIN Boxoffice B ON B.Movie_id = M.id
ORDER BY Rating DESC;

OUTER JOIN
25 July 2023
14:27

When joining table A to table B, a LEFT JOIN simply includes rows from A regardless of whether a
matching row is found in B. The RIGHT JOIN is the same, but reversed, keeping rows in B
regardless of whether a match is found in A. Finally, a FULL JOIN simply means that rows from
both tables are kept, regardless of whether a matching row exists in the other table.

When using any of these new joins, you will likely have to write additional logic to deal with NULLs
in the result and constraints.
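
The NULL handling mentioned above can be seen in a short sketch run through Python's built-in sqlite3 module. The building and employee rows are invented. Only LEFT JOIN is shown because, as far as I know, older SQLite releases do not accept RIGHT or FULL joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buildings (building_name TEXT, capacity INTEGER)")
conn.execute("CREATE TABLE employees (name TEXT, role TEXT, building TEXT)")
# Hypothetical data: building "2w" has no employees.
conn.executemany("INSERT INTO buildings VALUES (?, ?)", [("1e", 24), ("2w", 20)])
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [("Ann", "Engineer", "1e"), ("Bob", "Artist", "1e")])

# Every building is kept; where no employee matches, the employee columns are NULL.
all_buildings = conn.execute("""
    SELECT DISTINCT b.building_name, e.role
    FROM buildings AS b
    LEFT JOIN employees AS e ON e.building = b.building_name
    ORDER BY b.building_name, e.role
""").fetchall()

# Filtering on that NULL finds the buildings without any match.
empty_buildings = conn.execute("""
    SELECT b.building_name
    FROM buildings AS b
    LEFT JOIN employees AS e ON e.building = b.building_name
    WHERE e.name IS NULL
""").fetchall()

print(all_buildings)    # NULL comes back to Python as None
print(empty_buildings)
```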

Exercise 7 — Tasks
a. Find the list of all buildings that have employees
SELECT DISTINCT Building_name FROM Buildings B
INNER JOIN Employees E ON B.Building_name = E.Building;

b. Find the list of all buildings and their capacity


SELECT DISTINCT Building_name, Capacity FROM Buildings B;
c. List all buildings and the distinct employee roles in each building (including empty
buildings)
SELECT DISTINCT Building_name, Role FROM Buildings B
LEFT JOIN Employees E ON B.Building_name = E.Building;
NULL
25 July 2023
14:51

An alternative to NULL values in your database is to have data-type
appropriate default values, like 0 for numerical data, empty
strings for text data, etc. But if your database needs to store
incomplete data, then NULL values can be appropriate if the default
values will skew later analysis (for example, when taking averages
of numerical data).
When it's not possible to avoid NULL values, you can test a column
for NULL values in a WHERE clause by using either the IS NULL or IS NOT
NULL constraint.

Select query with constraints on NULL values


SELECT column, another_column, …
FROM mytable
WHERE column IS/IS NOT NULL
AND/OR another_condition
AND/OR …;
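
The sketch below, run through Python's built-in sqlite3 module on invented rows, illustrates two points from this section: NULL has to be tested with IS NULL (a plain = NULL comparison matches nothing), and substituting a default of 0 changes an average. COALESCE is a standard SQL function not covered in these notes; it is used here only to simulate the 0 default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, building TEXT, years_employed INTEGER)"
)
# Hypothetical data: Bob has no building and no recorded years.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [("Ann", "1e", 4), ("Bob", None, None), ("Cy", "2w", 8)])

# IS NULL finds the missing values.
unassigned = conn.execute(
    "SELECT name FROM employees WHERE building IS NULL"
).fetchall()

# Comparing with = NULL is never true, so no rows come back.
equals_null = conn.execute(
    "SELECT name FROM employees WHERE building = NULL"
).fetchall()

# AVG skips NULLs; replacing them with 0 drags the average down.
avg_skipping_null, avg_with_zero_default = conn.execute(
    "SELECT AVG(years_employed), AVG(COALESCE(years_employed, 0)) FROM employees"
).fetchone()

print(unassigned, equals_null, avg_skipping_null, avg_with_zero_default)
```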

Exercise 8 — Tasks
1. Find the name and role of all employees who have not been assigned to a building
SELECT Name, Role FROM employees
WHERE Building IS NULL;

2. Find the names of the buildings that hold no employees


SELECT DISTINCT building_name
FROM buildings
LEFT JOIN employees
ON building_name = building
WHERE role IS NULL;

Expressions
26 July 2023
09:47
Expressions can use mathematical and string functions along with
basic arithmetic to transform values when the query is executed.
Example query with expressions
SELECT particle_speed / 2.0 AS half_particle_speed
FROM physics_data
WHERE ABS(particle_position) * 10.0 > 500;

Each database has its own supported set of mathematical, string,
and date functions that can be used in a query, which you can find
in their own respective docs.
Select query with expression aliases
SELECT col_expression AS expr_description, …
FROM mytable;

Example query with both column and table name aliases


SELECT column AS better_column_name, …
FROM a_long_widgets_table_name AS mywidgets
INNER JOIN widget_sales
ON mywidgets.id = widget_sales.widget_id;
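
Here is a small sketch of expressions and aliases, run through Python's built-in sqlite3 module with invented figures. It also shows a detail worth knowing in SQLite: dividing two integers truncates, so dividing by 1000000.0 is one way to keep the fraction. Other databases may handle division differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.execute(
    "CREATE TABLE boxoffice (movie_id INTEGER, domestic_sales INTEGER, "
    "international_sales INTEGER)"
)
# Hypothetical data for illustration only.
conn.executemany("INSERT INTO movies VALUES (?, ?, ?)",
                 [(1, "Film A", 2000), (2, "Film B", 2001)])
conn.executemany("INSERT INTO boxoffice VALUES (?, ?, ?)",
                 [(1, 150500000, 250000000), (2, 90000000, 10000000)])

cursor = conn.execute("""
    SELECT m.title,
           (b.domestic_sales + b.international_sales) / 1000000   AS whole_millions,
           (b.domestic_sales + b.international_sales) / 1000000.0 AS exact_millions
    FROM movies AS m
    INNER JOIN boxoffice AS b ON b.movie_id = m.id
    WHERE m.year % 2 = 0
""")

# The aliases become the column names of the result.
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(column_names)
print(rows)
```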

Exercise 9 — Tasks
1. List all movies and their combined sales in millions of dollars
SELECT Title, (Domestic_sales + International_sales)/1000000 as Combined_sales
FROM Movies
INNER JOIN Boxoffice
ON Movies.id = Boxoffice.movie_id;

2. List all movies and their ratings in percent


SELECT title, rating * 10 AS rating_percent
FROM Movies
JOIN Boxoffice
ON Movies.id = Boxoffice.movie_id;

3. List all movies that were released on even number years


SELECT title, year
FROM movies
WHERE year % 2 = 0;

Common aggregate functions


Function | Description
COUNT(*), COUNT(column) | A common function used to count the number of rows in the group if no column name is specified. Otherwise, counts the number of rows in the group with non-NULL values in the specified column.
MIN(column) | Finds the smallest numerical value in the specified column for all rows in the group.
MAX(column) | Finds the largest numerical value in the specified column for all rows in the group.
AVG(column) | Finds the average numerical value in the specified column for all rows in the group.
SUM(column) | Finds the sum of all numerical values in the specified column for the rows in the group.

Select query with aggregate functions over all rows


SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression;

Without a specified grouping, each aggregate function is going to
run on the whole set of result rows and return a single value. And
like normal expressions, giving your aggregate functions an alias
ensures that the results will be easier to read and process.

Grouped aggregate functions


In addition to aggregating across all the rows, you can instead apply
the aggregate functions to individual groups of data within the
result set (i.e. box office sales for Comedies vs Action movies).
This would then create as many results as there are unique groups,
as defined by the GROUP BY clause.
Select query with aggregate functions over groups
SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression
GROUP BY column;

The GROUP BY clause works by grouping rows that have the same
value in the column specified.
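
The difference between aggregating over all rows and aggregating per group can be sketched as follows, using Python's built-in sqlite3 module. The employee rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, role TEXT, years_employed INTEGER)")
# Hypothetical data.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [("Ann", "Engineer", 2), ("Bob", "Engineer", 4), ("Cy", "Artist", 6)])

# Without GROUP BY, each aggregate runs over every row and returns one value.
overall = conn.execute(
    "SELECT COUNT(*) AS headcount, MAX(years_employed) AS longest FROM employees"
).fetchone()

# With GROUP BY, one result row is produced per distinct role.
per_role = conn.execute("""
    SELECT role, AVG(years_employed) AS avg_years
    FROM employees
    GROUP BY role
    ORDER BY role
""").fetchall()

print(overall)
print(per_role)
```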

Exercise 10 — Tasks
1. Find the longest time that an employee has been at the studio
SELECT MAX(Years_employed) FROM employees;

2. For each role, find the average number of years employed by employees in that
role
SELECT Role, AVG(Years_employed) FROM employees
GROUP BY Role;

3. Find the total number of employee years worked in each building


SELECT Building, SUM(Years_employed) FROM Employees
GROUP BY Building;

Aggregates
26 July 2023
10:46

The GROUP BY clause is executed after the WHERE clause (which filters
the rows which are to be grouped).
An additional HAVING clause is used specifically with the GROUP
BY clause to allow us to filter grouped rows from the result set.
Select query with HAVING constraint
SELECT group_by_column, AGG_FUNC(column_expression) AS aggregate_result_alias, …
FROM mytable
WHERE condition
GROUP BY column
HAVING group_condition;
The HAVING clause constraints are written the same way as
the WHERE clause constraints, and are applied to the grouped rows.
If you aren't using the GROUP BY clause, a simple WHERE clause
will suffice.
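
A short sketch of how WHERE and HAVING divide the work, run through Python's built-in sqlite3 module on invented rows: WHERE removes individual rows before grouping, and HAVING removes whole groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, role TEXT, years_employed INTEGER)")
# Hypothetical data.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("Ann", "Engineer", 2), ("Bob", "Engineer", 0), ("Cy", "Engineer", 5),
    ("Di", "Artist", 6), ("Ed", "Manager", 3), ("Flo", "Manager", 4),
])

# WHERE drops Bob (0 years) before grouping; HAVING then drops the
# Artist group because only one Artist remains.
roles_with_two_or_more = conn.execute("""
    SELECT role, COUNT(*) AS headcount
    FROM employees
    WHERE years_employed >= 1
    GROUP BY role
    HAVING COUNT(*) >= 2
    ORDER BY role
""").fetchall()

print(roles_with_two_or_more)
```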

Exercise 11 — Tasks
1. Find the number of Artists in the studio (without a HAVING clause)
SELECT COUNT(Name) FROM employees
WHERE Role = "Artist";

2. Find the number of Employees of each role in the studio


SELECT Role, COUNT(Name) FROM employees
GROUP BY Role;

3. Find the total number of years employed by all Engineers


SELECT Role, SUM(Years_employed) FROM employees
GROUP BY Role
HAVING Role = "Engineer";
Order Of Execution
26 July 2023
10:57

Each query begins with finding the data that we need in a database,
and then filtering that data down into something that can be
processed and understood as quickly as possible.
Complete SELECT query
SELECT DISTINCT column, AGG_FUNC(column_or_expression), …
FROM mytable
JOIN another_table
ON mytable.column = another_table.column
WHERE constraint_expression
GROUP BY column
HAVING constraint_expression
ORDER BY column ASC/DESC
LIMIT count OFFSET count;

FROM and JOINs


The FROM clause, and subsequent JOINs are first executed to
determine the total working set of data that is being queried.
WHERE
WHERE constraints are applied to the individual rows, and rows that
do not satisfy the constraint are discarded. Each of the constraints
can only access columns directly from the tables requested in
the FROM clause. Aliases are not accessible from this step in most
databases.
GROUP BY
The remaining rows after the WHERE constraints are applied are then
grouped based on common values in the column specified in
the GROUP BY clause.
HAVING
If the query has a GROUP BY clause, then the constraints in
the HAVING clause are then applied to the grouped rows, discarding the
grouped rows that don't satisfy the constraint. Like
the WHERE clause, aliases are also not accessible from this step in
most databases.
SELECT
Any expressions in the SELECT part of the query are finally computed.
DISTINCT
Of the remaining rows, rows with duplicate values in the column
marked as DISTINCT will be discarded.
ORDER BY
Since all the expressions in the SELECT part of the query have been
computed, you can reference aliases in this clause.
LIMIT / OFFSET
The rows that fall outside the range specified by
the LIMIT and OFFSET are discarded.
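
The steps above can be traced on a small query that uses most of the clauses at once. This is only a sketch, run through Python's built-in sqlite3 module on invented rows; the single sales column is a simplification of the exercise tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director TEXT, year INTEGER)"
)
conn.execute("CREATE TABLE boxoffice (movie_id INTEGER, sales INTEGER)")
# Hypothetical data.
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?)", [
    (1, "A", "Ann", 1995), (2, "B", "Ann", 2005), (3, "C", "Bob", 2010),
    (4, "D", "Cy", 2012), (5, "E", "Bob", 2015),
])
conn.executemany("INSERT INTO boxoffice VALUES (?, ?)",
                 [(1, 100), (2, 200), (3, 50), (4, 10), (5, 70)])

top_directors = conn.execute("""
    SELECT m.director, SUM(b.sales) AS total_sales   -- computed after grouping
    FROM movies AS m
    JOIN boxoffice AS b ON b.movie_id = m.id         -- 1. build the working set
    WHERE m.year >= 2000                             -- 2. drop Ann's 1995 film
    GROUP BY m.director                              -- 3. Ann=200, Bob=120, Cy=10
    HAVING SUM(b.sales) >= 100                       -- 4. drop Cy's group
    ORDER BY total_sales DESC                        -- 5. the alias is usable here
    LIMIT 2                                          -- 6. keep the first two rows
""").fetchall()

print(top_directors)
```

Because WHERE runs before grouping, Ann's total is 200 rather than 300: her 1995 film never reaches the SUM.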

Exercise 12 — Tasks
1. Find the number of movies each director has directed
SELECT Director, COUNT(Title) FROM movies
GROUP BY Director;

2. Find the total domestic and international sales that can be attributed to each
director
SELECT Director, SUM(Domestic_sales + International_sales) AS Total_sales FROM
Movies M
INNER JOIN Boxoffice B
ON M.Id = B.Movie_Id
GROUP BY Director;

Inserting Rows
26 July 2023
11:40

A table in a database is a two-dimensional set of rows and columns, with the
columns being the properties and the rows being instances of the
entity in the table.
In SQL, the database schema is what describes the structure of each
table, and the datatypes that each column of the table can contain.
Inserting new data
When inserting data into a database, we need to use
an INSERT statement, which declares which table to write into, the
columns of data that we are filling, and one or more rows of data to
insert.
In general, each row of data you insert should contain values for every
corresponding column in the table. You can insert multiple rows at a
time by just listing them sequentially.
Insert statement with values for all columns
INSERT INTO mytable
VALUES (value_or_expr, another_value_or_expr, …),
(value_or_expr_2, another_value_or_expr_2, …),
…;
If you have incomplete data and the table contains columns that
support default values, you can insert rows with only the columns of
data you have by specifying them explicitly.
Insert statement with specific columns
INSERT INTO mytable
(column, another_column, …)
VALUES (value_or_expr, another_value_or_expr, …),
(value_or_expr_2, another_value_or_expr_2, …),
…;
In these cases, the number of values needs to match the number of
columns specified.
In addition, you can use mathematical and string expressions with the
values that you are inserting.
Example Insert statement with expressions
INSERT INTO boxoffice
(movie_id, rating, sales_in_millions)
VALUES (1, 9.9, 283742034 / 1000000);
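
A minimal sketch of inserting several rows with only some columns specified, run through Python's built-in sqlite3 module. The DEFAULT on rating and the second row are assumptions added for illustration. In SQLite the division of two integers truncates, so the first row stores 283.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The DEFAULT 0.0 on rating is a hypothetical choice for this sketch.
conn.execute("""
    CREATE TABLE boxoffice (
        movie_id INTEGER,
        rating REAL DEFAULT 0.0,
        sales_in_millions INTEGER
    )
""")

# Two rows in one statement; rating is omitted, so its default is used.
conn.execute("""
    INSERT INTO boxoffice (movie_id, sales_in_millions)
    VALUES (1, 283742034 / 1000000),
           (2, 100000000 / 1000000)
""")

inserted = conn.execute(
    "SELECT movie_id, rating, sales_in_millions FROM boxoffice ORDER BY movie_id"
).fetchall()

print(inserted)
```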

Exercise 13 — Tasks
a. Add the studio's new production, Toy Story 4 to the list of movies (you can use
any director)
INSERT INTO movies VALUES (4, "Toy Story 4", "El Directore", 2015, 90);

b. Toy Story 4 has been released to critical acclaim! It had a rating of 8.7, and
made 340 million domestically and 270 million internationally. Add the
record to the BoxOffice table.
INSERT INTO boxoffice VALUES (4, 8.7, 340000000, 270000000);

Updating Rows
26 July 2023
11:55
The UPDATE statement requires you to specify exactly which table,
columns, and rows to update. In addition, the data you are updating
has to match the data type of the columns in the table schema.
Update statement with values
UPDATE mytable
SET column = value_or_expr,
other_column = another_value_or_expr,
…
WHERE condition;

The statement works by taking multiple column/value pairs, and
applying those changes to each and every row that satisfies the
constraint in the WHERE clause.
Leaving out the WHERE clause will cause the update to apply
to all rows.
One helpful tip is to always write the constraint first and test it in
a SELECT query to make sure you are updating the right rows, and
only then writing the column/value pairs to update.
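
The tip above can be sketched as a two-step routine: preview the rows with a SELECT, then reuse the same condition in the UPDATE. The example uses Python's built-in sqlite3 module; the rows are sample data loosely modelled on the exercise, and the placeholder director name is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director TEXT)")
# Hypothetical data: row 2 carries a wrong director.
conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", [
    (1, "Toy Story", "John Lasseter"),
    (2, "A Bug's Life", "Someone Else"),
    (3, "Toy Story 2", "John Lasseter"),
])

# Step 1: check which rows the constraint selects before changing anything.
preview = conn.execute("SELECT title FROM movies WHERE id = ?", (2,)).fetchall()

# Step 2: apply the update with exactly the same constraint.
cursor = conn.execute(
    "UPDATE movies SET director = ? WHERE id = ?", ("John Lasseter", 2)
)
rows_changed = cursor.rowcount

directors = [row[0] for row in conn.execute("SELECT DISTINCT director FROM movies")]

print(preview, rows_changed, directors)
```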

Exercise 14 — Tasks
1. The director for A Bug's Life is incorrect, it was actually directed by John
Lasseter
UPDATE Movies
SET Director = "John Lasseter"
WHERE Id = 2;

2. The year that Toy Story 2 was released is incorrect, it was actually released
in 1999
UPDATE Movies
SET Year = 1999
WHERE id = 3;

3. Both the title and director for Toy Story 8 are incorrect! The title should be "Toy
Story 3" and it was directed by Lee Unkrich
UPDATE movies
SET title = "Toy Story 3", director = "Lee Unkrich"
WHERE id = 11;

Deleting Rows
27 July 2023
14:23
When you need to delete data from a table in the database, you can
use a DELETE statement, which describes the table to act on, and the
rows of the table to delete through the WHERE clause.
Delete statement with condition
DELETE FROM mytable
WHERE condition;
If you decide to leave out the WHERE constraint, then all rows are
removed, which is a quick and easy way to clear out a table
completely (if intentional).
It is downright easy to irrevocably remove data, so always
read your DELETE statements twice and execute once.

Exercise 15 — Tasks
1. This database is getting too big, let's remove all movies that were
released before 2005.
DELETE FROM movies
WHERE Year < 2005;

2. Andrew Stanton has also left the studio, so please remove all movies directed by
him.
DELETE FROM movies
WHERE Director = "Andrew Stanton";

Creating Tables
27 July 2023
14:28

When you have new entities and relationships to store in your
database, you can create a new database table using the CREATE
TABLE statement.
Create table statement w/ optional table constraint and default value
CREATE TABLE IF NOT EXISTS mytable (
column DataType TableConstraint DEFAULT default_value,
another_column DataType TableConstraint DEFAULT default_value,
…
);
The structure of the new table is defined by its table schema, which
defines a series of columns.
Each column has a name, the type of data allowed in that column,
an optional table constraint on values being inserted, and an optional
default value.
If there already exists a table with the same name, the SQL
implementation will usually throw an error, so to suppress the error
and skip creating a table if one exists, you can use the IF NOT
EXISTS clause.

Table data types


Different databases support different data types, but the common
types support numeric, string, and other miscellaneous things like
dates, booleans, or even binary data.
Data type | Description
INTEGER, BOOLEAN | The integer datatypes can store whole integer values like the count of a number or an age. In some implementations, the boolean value is just represented as an integer value of just 0 or 1.
FLOAT, DOUBLE, REAL | The floating point datatypes can store more precise numerical data like measurements or fractional values. Different types can be used depending on the floating point precision required for that value.
CHARACTER(num_chars), VARCHAR(num_chars), TEXT | The text based datatypes can store strings and text in all sorts of locales. The distinction between the various types generally amounts to the underlying efficiency of the database when working with these columns. Both the CHARACTER and VARCHAR (variable character) types are specified with the max number of characters that they can store (longer values may be truncated), so can be more efficient to store and query with big tables.
DATE, DATETIME | SQL can also store date and time stamps to keep track of time series and event data. They can be tricky to work with, especially when manipulating data across timezones.
BLOB | Finally, SQL can store binary data in blobs right in the database. These values are often opaque to the database, so you usually have to store them with the right metadata to requery them.

Table constraints
Each column can have additional table constraints on it which limit
what values can be inserted into that column.
Constraint | Description
PRIMARY KEY | This means that the values in this column are unique, and each value can be used to identify a single row in this table.
AUTOINCREMENT | For integer values, this means that the value is automatically filled in and incremented with each row insertion. Not supported in all databases.
UNIQUE | This means that the values in this column have to be unique, so you can't insert another row with the same value in this column as another row in the table. Differs from the PRIMARY KEY in that it doesn't have to be a key for a row in the table.
NOT NULL | This means that the inserted value cannot be NULL.
CHECK (expression) | This allows you to run a more complex expression to test whether the values inserted are valid. For example, you can check that values are positive, or greater than a specific size, or start with a certain prefix, etc.
FOREIGN KEY | This is a consistency check which ensures that each value in this column corresponds to another value in a column in another table. For example, if there are two tables, one listing all Employees by ID, and another listing their payroll information, the FOREIGN KEY can ensure that every row in the payroll table corresponds to a valid employee in the master Employee list.
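
To make the constraints concrete, the sketch below creates two small tables and counts how many invalid inserts the database refuses. It uses Python's built-in sqlite3 module; the schema and values are invented. One SQLite-specific assumption: foreign keys are only enforced after PRAGMA foreign_keys = ON has been run on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT UNIQUE,
        age INTEGER CHECK (age > 0)
    )
""")
conn.execute("""
    CREATE TABLE payroll (
        employee_id INTEGER REFERENCES employees (id),
        salary INTEGER
    )
""")
conn.execute("INSERT INTO employees VALUES (1, 'Ann', 'ann@example.com', 30)")

# Each statement below breaks exactly one constraint (hypothetical values).
bad_statements = [
    "INSERT INTO employees VALUES (1, 'Dup', 'dup@example.com', 30)",   # PRIMARY KEY
    "INSERT INTO employees VALUES (2, NULL, 'x@example.com', 30)",      # NOT NULL
    "INSERT INTO employees VALUES (3, 'Bob', 'ann@example.com', 30)",   # UNIQUE
    "INSERT INTO employees VALUES (4, 'Cy', 'cy@example.com', -5)",     # CHECK
    "INSERT INTO payroll VALUES (99, 1000)",                            # FOREIGN KEY
]

rejected = 0
for statement in bad_statements:
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError as error:
        rejected += 1
        print("rejected:", error)

employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, employee_count)
```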

An example
Movies table schema
CREATE TABLE movies (
id INTEGER PRIMARY KEY,
title TEXT,
director TEXT,
year INTEGER,
length_minutes INTEGER);

Exercise 16 — Tasks
a. Create a new table named Database with the following columns:
– Name A string (text) describing the name of the database
– Version A number (floating point) of the latest version of this database
– Download_count An integer count of the number of times this database was
downloaded
This table has no constraints.

CREATE TABLE Database (
Name TEXT,
Version FLOAT,
Download_count INTEGER
);

Altering Tables
27 July 2023
14:38

As your data changes over time, SQL provides a way for you to
update your corresponding tables and database schemas by using
the ALTER TABLE statement to add, remove, or modify columns and
table constraints.
Adding columns
You need to specify the data type of the column along with any
potential table constraints and default values to be applied to both
existing and new rows. In some databases like MySQL, you can even
specify where to insert the new column using
the FIRST or AFTER clauses.
Altering table to add new column(s)
ALTER TABLE mytable
ADD column DataType OptionalTableConstraint
DEFAULT default_value;

Removing columns
Dropping columns is as easy as specifying the column to drop;
however, some databases (including SQLite before version 3.35.0)
don't support this feature. Instead you may have to create a new
table and migrate the data over.
Altering table to remove column(s)
ALTER TABLE mytable
DROP column_to_be_deleted;

Renaming the table


If you need to rename the table itself, you can also do that using
the RENAME TO clause of the statement.
Altering table name
ALTER TABLE mytable
RENAME TO new_table_name;
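
A short sketch of adding a column with a default and then renaming the table, run through Python's built-in sqlite3 module. The new table name films and the sample row are invented. The sqlite_master lookup is SQLite-specific and is used only to confirm the rename.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO movies VALUES (1, 'Toy Story')")

# The default is applied to rows that existed before the column was added.
conn.execute("ALTER TABLE movies ADD COLUMN language TEXT DEFAULT 'English'")
existing_row = conn.execute("SELECT title, language FROM movies").fetchone()

# After the rename, the data is reachable only under the new name.
conn.execute("ALTER TABLE movies RENAME TO films")
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

print(existing_row)
print(table_names)
```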

Exercise 17 — Tasks
1. Add a column named Aspect_ratio with a FLOAT data type to store the aspect-
ratio each movie was released in.
ALTER TABLE movies
ADD Aspect_ratio FLOAT;

2. Add another column named Language with a TEXT data type to store the
language that the movie was released in. Ensure that the default for this language
is English.
ALTER TABLE movies
ADD Language TEXT
DEFAULT "English";

Dropping Tables
27 July 2023
14:45
In some rare cases, you may want to remove an entire table
including all of its data and metadata, and to do so, you can use
the DROP TABLE statement, which differs from the DELETE statement
in that it also removes the table schema from the database entirely.
Drop table statement
DROP TABLE IF EXISTS mytable;

Like the CREATE TABLE statement, the database may throw an error if
the specified table does not exist, and to suppress that error, you
can use the IF EXISTS clause.
In addition, if you have another table that is dependent on columns
in the table you are removing (for example, with a FOREIGN
KEY dependency), then you will have to either update all dependent
tables first to remove the dependent rows or remove those tables
entirely.

1. Drop the BoxOffice table

Subqueries
27 July 2023
15:22

Example: General subquery


From a list of all Sales Associates, with data on the revenue that each Associate brings in,
and their individual salary, you now want to find out which of your Associates are costing
the company more than the average revenue brought per Associate.

First, you would need to calculate the average revenue all the Associates are generating:
SELECT AVG(revenue_generated)
FROM sales_associates;

And then using that result, we can then compare the costs of each of the Associates
against that value. To use it as a subquery, we can just write it straight into
the WHERE clause of the query:

SELECT *
FROM sales_associates
WHERE salary >
(SELECT AVG(revenue_generated)
FROM sales_associates);
As the constraint is executed, each Associate's salary will be tested against the value
queried from the inner subquery.
A subquery can be referenced anywhere a normal table can be
referenced. Inside a FROM clause, you can JOIN subqueries with other
tables, inside a WHERE or HAVING constraint, you can test expressions
against the results of the subquery, and even in expressions in
the SELECT clause, which allow you to return data directly from the
subquery. They are generally executed in the same logical order as
the part of the query that they appear in.
Because subqueries can be nested, each subquery must be fully
enclosed in parentheses in order to establish proper hierarchy.
Subqueries can otherwise reference any tables in the database, and
make use of the constructs of a normal query (though some
implementations don't allow subqueries to use LIMIT or OFFSET).
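
The Sales Associates example can be tried on a few rows. This sketch uses Python's built-in sqlite3 module; the names and figures are invented, and only the table and column names come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_associates (name TEXT, salary INTEGER, revenue_generated INTEGER)"
)
# Hypothetical data: the average revenue is (100 + 60 + 80) / 3 = 80.
conn.executemany("INSERT INTO sales_associates VALUES (?, ?, ?)",
                 [("Ann", 50, 100), ("Bob", 90, 60), ("Cy", 70, 80)])

# The inner query yields a single value that the outer WHERE compares against.
average_revenue = conn.execute(
    "SELECT AVG(revenue_generated) FROM sales_associates"
).fetchone()[0]

costly_associates = conn.execute("""
    SELECT name
    FROM sales_associates
    WHERE salary > (SELECT AVG(revenue_generated)
                    FROM sales_associates)
""").fetchall()

print(average_revenue)
print(costly_associates)
```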
Correlated subqueries

A more powerful type of subquery is the correlated subquery, in
which the inner query references, and is dependent on, a column or
alias from the outer query. Unlike the subqueries above, each of
these inner queries needs to be run for each of the rows in the outer
query, since the inner query is dependent on the current outer query
row.
Example: Correlated subquery
Instead of the list of just Sales Associates above, imagine if you have a general list of
Employees, their departments (engineering, sales, etc.), revenue, and salary. This time,
you are now looking across the company to find the employees who perform worse than
average in their department.
For each employee, you would need to calculate their cost relative to the average
revenue generated by all people in their department. To take the average for the
department, the subquery will need to know what department each employee is in:
SELECT *
FROM employees
WHERE salary >
(SELECT AVG(revenue_generated)
FROM employees AS dept_employees
WHERE dept_employees.department = employees.department);
These kinds of complex queries can be powerful, but also difficult to
read and understand, so you should take care using them. If
possible, try and give meaningful aliases to the temporary values
and tables. In addition, correlated subqueries can be difficult to
optimize, so performance characteristics may vary across different
databases.
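
To show how the correlated version differs from a plain subquery, the sketch below runs both on the same invented rows using Python's built-in sqlite3 module. Cy earns less than the company-wide average revenue but more than the engineering average, so only the correlated query reports him.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        name TEXT, department TEXT, salary INTEGER, revenue_generated INTEGER
    )
""")
# Hypothetical data: sales averages 80 in revenue, engineering averages 50.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", [
    ("Ann", "sales", 50, 100), ("Bob", "sales", 90, 60),
    ("Cy", "engineering", 60, 40), ("Di", "engineering", 30, 60),
])

# Correlated: the inner average is recomputed for each outer row's department.
per_department = [row[0] for row in conn.execute("""
    SELECT name
    FROM employees
    WHERE salary > (SELECT AVG(revenue_generated)
                    FROM employees AS dept_employees
                    WHERE dept_employees.department = employees.department)
    ORDER BY name
""")]

# Uncorrelated: a single company-wide average (65) is used for every row.
company_wide = [row[0] for row in conn.execute("""
    SELECT name
    FROM employees
    WHERE salary > (SELECT AVG(revenue_generated) FROM employees)
    ORDER BY name
""")]

print(per_department)
print(company_wide)
```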
Existence tests

When we introduced WHERE constraints in Lesson 2: Queries with
constraints, the IN operator was used to test whether the column
value in the current row existed in a fixed list of values. In complex
queries, this can be extended using subqueries to test whether a
column value exists in a dynamic list of values.
Select query with subquery constraint
SELECT *, …
FROM mytable
WHERE column
IN/NOT IN (SELECT another_column
FROM another_table);
When doing this, notice that the inner subquery must select for a
column value or expression to produce a list that the outer column
value can be tested against. This type of constraint is powerful when
the constraints are based on current data.

From <https://sqlbolt.com/topic/subqueries>

Unions, Intersections & Exceptions


28 July 2023
13:49

When working with multiple tables, the UNION and UNION ALL operators allow you to
append the results of one query to another, assuming that they have the same column
count, order and data type. If you use UNION without ALL, duplicate rows
between the tables will be removed from the result.
Select query with set operators
SELECT column, another_column
FROM mytable
UNION / UNION ALL / INTERSECT / EXCEPT
SELECT other_column, yet_another_column
FROM another_table
ORDER BY column DESC
LIMIT n;

The UNION happens before the ORDER BY and LIMIT. It's not common to use UNIONs,
but if you have data in different tables that can't be joined and processed, it can be an
alternative to making multiple queries on the database.
Similar to the UNION, the INTERSECT operator will ensure that only rows that are
identical in both result sets are returned, and the EXCEPT operator will ensure that only
rows in the first result set that aren't in the second are returned. This means that
the EXCEPT operator is query order-sensitive, like the LEFT JOIN and RIGHT JOIN.
Both INTERSECT and EXCEPT also discard duplicate rows after their respective
operations, though some databases also support INTERSECT ALL and EXCEPT ALL to
allow duplicates to be retained and returned.
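
The four set operators can be compared side by side on two tiny tables. This is a sketch using Python's built-in sqlite3 module; the tables a and b and their values are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (n INTEGER)")
conn.execute("CREATE TABLE b (n INTEGER)")
# Hypothetical data: the value 2 appears twice in table a.
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])

def values(sql):
    """Run a query and return its single column as a list."""
    return [row[0] for row in conn.execute(sql)]

# The ORDER BY applies to the combined result, after the set operation.
union = values("SELECT n FROM a UNION SELECT n FROM b ORDER BY n")
union_all = values("SELECT n FROM a UNION ALL SELECT n FROM b ORDER BY n")
intersect = values("SELECT n FROM a INTERSECT SELECT n FROM b ORDER BY n")
a_except_b = values("SELECT n FROM a EXCEPT SELECT n FROM b ORDER BY n")
b_except_a = values("SELECT n FROM b EXCEPT SELECT n FROM a ORDER BY n")

print(union, union_all, intersect, a_except_b, b_except_a)
```

Swapping the two queries around EXCEPT changes the answer, which is the order sensitivity described above.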
