SQL Database
Section 7. Subquery
This section deals with the subquery, which is a query nested within another statement such as a
SELECT, INSERT, UPDATE, or DELETE statement.
Subquery – explains the subquery concept and shows you how to use the various subquery types to
select data.
Correlated subquery – introduces you to the correlated subquery concept.
EXISTS – tests for the existence of rows returned by a subquery.
ANY – compares a value with a single-column set of values returned by a subquery and returns
TRUE if the value matches any value in the set.
ALL – compares a value with a single-column set of values returned by a subquery and returns
TRUE if the value matches all values in the set.
DROP DATABASE – learn how to delete existing databases.
CREATE SCHEMA – describe how to create a new schema in a database.
ALTER SCHEMA – show how to transfer a securable from a schema to another within the same
database.
DROP SCHEMA – learn how to delete a schema from a database.
CREATE TABLE – walk you through the steps of creating a new table in a specific schema of a
database.
Identity column – learn how to use the IDENTITY property to create the identity column for a
table.
Sequence – describe how to generate a sequence of numeric values based on a specification.
ALTER TABLE ADD column – show you how to add one or more columns to an existing table.
ALTER TABLE ALTER COLUMN – show you how to change the definition of existing
columns in a table.
ALTER TABLE DROP COLUMN – learn how to drop one or more columns from a table.
Computed columns – learn how to use computed columns to reuse the calculation logic in multiple
queries.
DROP TABLE – show you how to delete tables from the database.
TRUNCATE TABLE – delete all data from a table faster and more efficiently.
SELECT INTO – learn how to create a table and insert data from a query into it.
Rename a table – walk you through the process of renaming a table to a new one.
Temporary tables – introduce you to temporary tables for temporarily storing intermediate data
in stored procedures or a database session.
Synonym – explain what a synonym is and show you how to create synonyms for database
objects.
Primary key – explain the primary key concept and show you how to use the primary key
constraint to manage a primary key of a table.
Foreign key – introduce you to the foreign key concept and show you how to use the FOREIGN
KEY constraint to enforce the link of data in two tables.
NOT NULL constraint – show you how to ensure a column does not accept NULL.
UNIQUE constraint – ensure that data contained in a column, or a group of columns, is unique
among rows in a table.
CHECK constraint – walk you through the process of adding logic for checking data before
storing them in tables.
Advanced SQL
This section covers the advanced SQL Server topics including views, indexes, stored procedures,
user-defined functions, and triggers.
Aggregate function
An aggregate function performs a calculation on one or more values and returns a single value. The
aggregate function is often used with the GROUP BY clause and HAVING clause of
the SELECT statement.
Aggregate function   Description
AVG                  The AVG() aggregate function calculates the average of non-NULL values in a set.
CHECKSUM_AGG         The CHECKSUM_AGG() function calculates a checksum value based on a group of rows.
COUNT                The COUNT() aggregate function returns the number of rows in a group; COUNT(*) includes rows with NULL values.
COUNT_BIG            The COUNT_BIG() aggregate function returns the number of rows (with BIGINT data type) in a group; COUNT_BIG(*) includes rows with NULL values.
MAX                  The MAX() aggregate function returns the highest value (maximum) in a set of non-NULL values.
MIN                  The MIN() aggregate function returns the lowest value (minimum) in a set of non-NULL values.
STDEV                The STDEV() function returns the statistical standard deviation of all values provided in the expression based on a sample of the data population.
STDEVP               The STDEVP() function also returns the standard deviation for all values in the provided expression, but does so based on the entire data population.
SUM                  The SUM() aggregate function returns the summation of all non-NULL values in a set.
VAR                  The VAR() function returns the statistical variance of values in an expression based on a sample of the specified population.
VARP                 The VARP() function returns the statistical variance of values in an expression but does so based on the entire data population.
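As a rough illustration of several of these functions used together with GROUP BY, the following sketch summarizes prices per brand. It assumes the production.products table used elsewhere in these notes has brand_id and list_price columns; the column aliases are only illustrative.

```sql
-- Sketch: one summary row per brand (table and column names are assumed).
SELECT
    brand_id,
    COUNT (*) AS product_count,     -- counts every row in the group
    MIN (list_price) AS min_price,  -- lowest non-NULL price
    MAX (list_price) AS max_price,  -- highest non-NULL price
    AVG (list_price) AS avg_price   -- average of non-NULL prices
FROM
    production.products
GROUP BY
    brand_id
ORDER BY
    brand_id;
```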
SQL IN Clause
SELECT column1, column2....columnN
FROM table_name
WHERE column_name IN (val-1, val-2,...val-N);
HAVING (arithmetic function condition);
SQL CREATE DATABASE Statement
CREATE DATABASE database_name;
value.
TODATETIMEOFFSET Transforms a DATETIME2 value into a DATETIMEOFFSET value.
Modifying dates
Constructing date and time from their parts
Function Description
DATEFROMPARTS Returns a DATE value from the year, month, and day.
DATETIME2FROMPARTS Returns a DATETIME2 value from the date and time arguments.
DATETIMEOFFSETFROMPARTS Returns a DATETIMEOFFSET value from the date and time arguments.
TIMEFROMPARTS Returns a TIME value from the time parts with the specified precision.
Validating date and time values
Function Description
ISDATE Check if a value is a valid date, time, or datetime value
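A small sketch of the construction and validation functions listed above. The literal values and column aliases are made up for illustration; the query needs no tables.

```sql
-- Sketch: build date/time values from parts and validate strings.
SELECT
    DATEFROMPARTS (2017, 1, 15) AS order_date,         -- year, month, day
    TIMEFROMPARTS (23, 59, 59, 0, 0) AS closing_time,  -- hour, minute, seconds, fractions, precision
    ISDATE ('20170115') AS is_valid,                   -- 1 when the string is a valid date
    ISDATE ('not a date') AS is_invalid;               -- 0 otherwise
```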
STRING FUNCTIONS
Function Description
ASCII Return the ASCII code value of a character
CHAR Convert an ASCII value to a character
CHARINDEX Search for a substring inside a string starting from a specified location and return the position of the substring.
CONCAT Join two or more strings into one string
CONCAT_WS Concatenate multiple strings with a separator into a single string
DIFFERENCE Compare the SOUNDEX() values of two strings
FORMAT Return a value formatted with the specified format and optional culture
LEFT Extract a given number of characters from a character string starting from the left
LEN Return the number of characters of a character string
LOWER Convert a string to lowercase
LTRIM Return a new string from a specified string after removing all leading blanks
NCHAR Return the Unicode character with the specified integer code, as defined by the Unicode standard
PATINDEX Returns the starting position of the first occurrence of a pattern in a string.
QUOTENAME Returns a Unicode string with the delimiters added to make the input string a valid delimited identifier
REPLACE Replace all occurrences of a substring, within a string, with another substring
REPLICATE Return a string repeated a specified number of times
REVERSE Return the reverse order of a character string
RIGHT Extract a given number of characters from a character string starting from the right
RTRIM Return a new string from a specified string after removing all trailing blanks
SOUNDEX Return a four-character (SOUNDEX) code of a string based on how it is spoken
SPACE Returns a string of repeated spaces.
STR Returns character data converted from numeric data.
STRING_AGG Concatenate rows of strings with a specified separator into a new string
STRING_ESCAPE Escapes special characters in a string and returns a new string with escaped characters
STRING_SPLIT A table-valued function that splits a string into rows of substrings based on a specified separator
STUFF Delete a part of a string and then insert another substring into the string starting at a specified position
SUBSTRING Extract a substring within a string starting from a specified location with a specified length
TRANSLATE Replace several single-characters, one-to-one translation in one operation.
TRIM Return a new string from a specified string after removing all leading and trailing blanks
UNICODE Returns the integer value, as defined by the Unicode standard, of a character.
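To make a few of the string functions above concrete, here is a short sketch that needs no tables. The literals and aliases are invented for illustration; CONCAT_WS and TRIM are assumed to be available, which requires SQL Server 2017 or later.

```sql
-- Sketch: a handful of string functions applied to literal values.
SELECT
    CONCAT_WS (', ', 'Santa Cruz', 'CA') AS city_state,     -- 'Santa Cruz, CA'
    LEN ('SQL Server') AS name_length,                      -- 10
    LOWER (LEFT ('SQL Server', 3)) AS prefix,               -- 'sql'
    CHARINDEX ('Server', 'SQL Server') AS server_position,  -- 5
    REPLACE ('2017-01-15', '-', '') AS compact_date,        -- '20170115'
    TRIM ('  padded  ') AS trimmed;                         -- 'padded'
```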
SQL Server Data Types
String Data Types
Data type Description Max size
decimal(p,s) Fixed precision and scale numbers.
Allows numbers from -10^38 + 1 to 10^38 - 1.
The p parameter indicates the maximum total number of digits that can be stored (both to the left and to the right
of the decimal point). p must be a value from 1 to 38. Default is 18.
The s parameter indicates the maximum number of digits stored to the right of the decimal point. s must be a
value from 0 to p. Default value is 0.
float(n) Floating precision number data from -1.79E+308 to 1.79E+308.
The n parameter indicates whether the field should hold 4 or 8 bytes. float(24) holds a 4-byte field and float(53)
holds an 8-byte field. Default value of n is 53.
Date and Time Data Types
datetime From January 1, 1753 to December 31, 9999 with an accuracy of 3.33 milliseconds
datetime2 From January 1, 0001 to December 31, 9999 with an accuracy of 100 nanoseconds
date Store a date only. From January 1, 0001 to December 31, 9999
datetimeoffset The same as datetime2 with the addition of a time zone offset
timestamp Stores a unique number that gets updated every time a row gets created or modified. The
timestamp value is based upon an internal clock and does not correspond to real time. Each
table may have only one timestamp variable
SQL Keywords
Keyword Description
ADD Adds a column in an existing table
ALL Returns true if all of the subquery values meet the condition
ANY Returns true if any of the subquery values meet the condition
CHECK A constraint that limits the value that can be placed in a column
CREATE UNIQUE INDEX Creates a unique index on a table (no duplicate values)
CREATE VIEW Creates a view based on the result set of a SELECT statement
DATABASE Creates or deletes an SQL database
FOREIGN KEY A constraint that is a key used to link two tables together
FULL OUTER JOIN Returns all rows when there is a match in either left table or right table
INNER JOIN Returns rows that have matching values in both tables
INSERT INTO SELECT Copies data from one table into another table
LEFT JOIN Returns all rows from the left table, and the matching rows from the right table
NOT NULL A constraint that enforces a column to not accept NULL values
OUTER JOIN Returns all rows when there is a match in either left table or right table
PRIMARY KEY A constraint that uniquely identifies each record in a database table
RIGHT JOIN Returns all rows from the right table, and the matching rows from the left table
SELECT INTO Copies data from one table into a new table
SELECT TOP Specifies the number of records to return in the result set
TOP Specifies the number of records to return in the result set
TRUNCATE TABLE Deletes the data inside a table, but not the table itself
UNION ALL Combines the result set of two or more SELECT statements (allows duplicate values)
UNIQUE A constraint that ensures that all values in a column are unique
Database:
Database tables are objects that store all the data in a database.
In a table, data is logically organized in a row-and-column format, which is similar to
a spreadsheet.
In a table, each row represents a unique record and each column represents a field in
the record. For example, the customers table contains customer data such as customer
identification number, first name, last name, phone, email, and address information.
Schemas
A schema logically groups tables and other database objects.
Example: we have two schemas: sales and production.
The sales schema groups all the sales related tables while
the production schema groups all the production related tables.
Select
To query data from a table, we use the SELECT statement.
The following example returns the cities in California that have more than 10 customers:
SELECT
city,
COUNT (*)
FROM
sales.customers
WHERE
state = 'CA'
GROUP BY
city
HAVING
COUNT (*) > 10
ORDER BY
city;
ORDER BY clause
When you use the SELECT statement to query data from a table, the order of rows in the result set is
not guaranteed. SQL Server can return a result set with an unspecified order of rows.
The only way for you to guarantee that the rows in the result set are sorted is to use the
ORDER BY clause.
The ORDER BY keyword is used to sort the result-set in ascending or descending
order.
The ORDER BY keyword sorts the records in ascending order by default. To sort the
records in descending order, use the DESC keyword.
The following statement sorts the customers by city in descending order and then sorts
the sorted result set by first name in ascending order:
SELECT
city,
first_name,
last_name
FROM
sales.customers
ORDER BY
city DESC,
first_name ASC;
OFFSET FETCH
The OFFSET and FETCH clauses limit the number of rows returned by a query.
The OFFSET and FETCH clauses are options of the ORDER BY clause.
The OFFSET clause specifies the number of rows to skip before starting to return rows
from the query. The offset_row_count can be a constant, variable, or parameter that
is greater than or equal to zero.
The FETCH clause specifies the number of rows to return after the OFFSET clause has
been processed.
The number of rows to fetch can be a constant, variable, or scalar that is greater than or equal to one.
The OFFSET clause is mandatory while the FETCH clause is optional. Also,
FIRST and NEXT are synonyms, as are ROW and ROWS, so you can use them interchangeably.
To skip the first 10 products and return the rest, you use the OFFSET clause as shown in the
following statement:
SELECT
product_name,
list_price
FROM
production.products
ORDER BY
list_price,
product_name
OFFSET 10 ROWS;
To skip the first 10 products and select the next 10 products, you use
both OFFSET and FETCH clauses as follows:
SELECT
product_name,
list_price
FROM
production.products
ORDER BY
list_price,
product_name
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;
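To get the 10 most expensive products, you sort by list price in descending order, skip no rows,
and fetch the first 10 rows:
SELECT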
product_name,
list_price
FROM
production.products
ORDER BY
list_price DESC,
product_name
OFFSET 0 ROWS
FETCH FIRST 10 ROWS ONLY;
TOP
The SELECT TOP clause is used to specify the number or percentage of rows to return
in a query result set.
Because the order of rows stored in a table is unspecified, the SELECT TOP statement should
always be used in conjunction with the ORDER BY clause.
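For instance, a minimal sketch of both forms of SELECT TOP, assuming the production.products table used elsewhere in these notes:

```sql
-- Sketch: the 10 most expensive products (a fixed number of rows).
SELECT TOP 10
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price DESC;

-- Sketch: the most expensive 1 percent of products (a percentage of rows).
SELECT TOP 1 PERCENT
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price DESC;
```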
DISTINCT
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values; and sometimes you only
want to list the different (distinct) values.
DISTINCT vs. GROUP BY: The following statement uses the GROUP BY clause to return
distinct cities together with state and zip code from the sales.customers table:
SELECT
city,
state,
zip_code
FROM
sales.customers
GROUP BY
city, state, zip_code
ORDER BY
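city, state, zip_code;
The following query uses DISTINCT instead and returns the same set of rows, without the explicit ordering:
SELECT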
DISTINCT
city,
state,
zip_code
FROM
sales.customers;
Where Clause
The WHERE clause is used to filter records.
It is used to extract only those records that fulfill a specified condition.
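A minimal sketch of a filtered query, assuming the production.products table used elsewhere in these notes; the price threshold is arbitrary:

```sql
-- Sketch: keep only the products priced above 1,000.
SELECT
    product_name,
    list_price
FROM
    production.products
WHERE
    list_price > 1000
ORDER BY
    list_price;
```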
Null Value
A field with a NULL value is a field with no value.
If a field in a table is optional, it is possible to insert a new record or update a record
without adding a value to this field. Then, the field will be saved with a NULL value.
It is not possible to test for NULL values with comparison operators, such as =, <, or
<>.
We will have to use the IS NULL and IS NOT NULL operators instead.
And Operator
The AND is a logical operator that allows you to combine two Boolean expressions.
It returns TRUE only when both expressions evaluate to TRUE.
The boolean_expression is any valid Boolean expression that evaluates
to TRUE, FALSE, or UNKNOWN.
To get the products whose brand id is one or two and whose list price is greater than 1,000, you
combine OR with AND and use parentheses to control the order of evaluation, as follows:
SELECT
*
FROM
production.products
WHERE
(brand_id = 1 OR brand_id = 2)
AND list_price > 1000
ORDER BY
brand_id;
To test whether a value is NULL, you always use the IS NULL operator. The following query finds
the customers who have no phone number:
SELECT
customer_id,
first_name,
last_name,
phone
FROM
sales.customers
WHERE
phone IS NULL
ORDER BY
first_name,
last_name;
OR operator
The SQL Server OR is a logical operator that allows you to combine two Boolean
expressions. It returns TRUE when either of the conditions evaluates to TRUE.
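As a small sketch of OR in a WHERE clause, assuming the production.products table used elsewhere in these notes (the two thresholds are arbitrary):

```sql
-- Sketch: products that are either very cheap or very expensive.
SELECT
    product_name,
    list_price
FROM
    production.products
WHERE
    list_price < 200
    OR list_price > 6000
ORDER BY
    list_price;
```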
IN operator
The IN operator allows you to specify multiple values in a WHERE clause.
The IN operator is a shorthand for multiple OR conditions.
It is a logical operator that allows you to test whether a specified value matches any value
in a list.
To negate the IN operator, use the NOT IN operator.
The following query returns the names and list prices of the products that store 1 holds
in a quantity greater than or equal to 30:
SELECT
product_name,
list_price
FROM
production.products
WHERE
product_id IN (
SELECT
product_id
FROM
production.stocks
WHERE
store_id = 1 AND quantity >= 30
)
ORDER BY
product_name;
BETWEEN operator
The BETWEEN operator selects values within a given range.
The values can be numbers, text, or dates.
The BETWEEN operator is inclusive: begin and end values are included.
To negate the result of the BETWEEN operator, you use the NOT BETWEEN operator.
The following query finds the orders that customers placed between January 15, 2017 and January 17, 2017:
SELECT
order_id,
customer_id,
order_date,
order_status
FROM
sales.orders
WHERE
order_date BETWEEN '20170115' AND '20170117'
ORDER BY
order_date;
Notice that to specify a date constant, you use the format 'YYYYMMDD' where YYYY is a 4-digit
year e.g., 2017, MM is a 2-digit month e.g., 01, and DD is a 2-digit day e.g., 15.
Like Operator:
The LIKE operator is used in a WHERE clause to search for a specified pattern in a
column.
There are two wildcards often used in conjunction with the LIKE operator:
The percent sign (%) represents zero, one, or multiple characters
The underscore sign (_) represents one, single character
LIKE Operator                          Description
WHERE CustomerName LIKE 'a%'           Finds any values that start with "a"
WHERE CustomerName LIKE '%a'           Finds any values that end with "a"
WHERE CustomerName LIKE '%or%'         Finds any values that have "or" in any position
WHERE CustomerName LIKE '_r%'          Finds any values that have "r" in the second position
WHERE CustomerName LIKE 'a_%'          Finds any values that start with "a" and are at least 2 characters in length
WHERE CustomerName LIKE 'a__%'         Finds any values that start with "a" and are at least 3 characters in length
WHERE ContactName LIKE 'a%o'           Finds any values that start with "a" and end with "o"
The following wildcard characters use MS Access syntax; in SQL Server, write % instead of *,
[^...] instead of [!...], and [0-9] instead of #:
Symbol  Description                                           Example
*       Represents zero or more characters                    bl* finds bl, black, blue, and blob
[]      Represents any single character within the brackets   h[oa]t finds hot and hat, but not hit
!       Represents any character not in the brackets          h[!oa]t finds hit, but not hot and hat
#       Represents any single numeric character               2#5 finds 205, 215, 225, 235, 245, 255, 265, 275, 285, and 295
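The patterns above can be combined in a single WHERE clause. The following sketch uses SQL Server wildcard syntax and assumes the sales.customers table used elsewhere in these notes; the patterns themselves are arbitrary.

```sql
-- Sketch: three LIKE patterns written with SQL Server wildcards.
SELECT
    customer_id,
    first_name,
    last_name
FROM
    sales.customers
WHERE
    last_name LIKE 'z%'          -- starts with "z"
    OR last_name LIKE '%er'      -- ends with "er"
    OR last_name LIKE '[^a-x]%'  -- first character is not in the range a to x
ORDER BY
    last_name;
```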
SQL Aliases
SQL aliases are used to give a table, or a column in a table, a temporary name.
Aliases are often used to make column names more readable.
An alias only exists for the duration of that query.
An alias is created with the AS keyword.
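A brief sketch of both kinds of alias, assuming the sales.customers table used elsewhere in these notes; the alias names c and full_name are only illustrative:

```sql
-- Sketch: c is a table alias, full_name is a column alias.
SELECT
    c.first_name + ' ' + c.last_name AS full_name
FROM
    sales.customers AS c
ORDER BY
    full_name;  -- a column alias can be referenced in ORDER BY
```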
SQL JOIN
A JOIN clause is used to combine rows from two or more tables, based on a related column
between them.
The RIGHT JOIN combines data from two or more tables.
The RIGHT JOIN clause starts selecting data from the right table and matching with the
rows from the left table.
The RIGHT JOIN returns a result set that includes all rows in the right table,
whether or not they have matching rows from the left table.
If a row in the right table does not have any matching rows from the left table, the
columns of the left table in the result set will contain NULL.
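As an illustration of the behavior just described, the following sketch lists every product together with the orders it appears in. It assumes that sales.order_items has a product_id column referencing production.products, which these notes do not show explicitly.

```sql
-- Sketch: production.products is the right table, so every product is returned.
-- Products that were never ordered appear with NULL in order_id.
SELECT
    p.product_name,
    i.order_id
FROM
    sales.order_items i
RIGHT JOIN production.products p ON p.product_id = i.product_id
ORDER BY
    i.order_id;
```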
GROUP BY
The GROUP BY statement groups rows that have the same values into summary rows,
like "find the number of customers in each country".
The GROUP BY statement is often used with aggregate functions
(COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more
columns.
For example, the following query returns the number of orders placed by customers 1 and 2, broken
down by year:
SELECT
customer_id,
YEAR (order_date) order_year,
COUNT (order_id) order_placed
FROM
sales.orders
WHERE
customer_id IN (1, 2)
GROUP BY
customer_id,
YEAR (order_date)
ORDER BY
customer_id;
Self Join
A self join allows you to join a table to itself.
It is useful for querying hierarchical data or comparing rows within the same table.
A self join uses the inner join or left join clause. Because the query that uses self join
references the same table, the table alias is used to assign different names to the same
table within the query.
EXAMPLE: The staffs table stores the staff information such as id, first name, last name, and
email. It also has a column named manager_id that specifies the direct manager. For
example, Mireya reports to Fabiola because the value in the manager_id column of Mireya's row is
Fabiola's staff id. Fabiola has no manager, so her manager_id column contains NULL.
To get who reports to whom, you use the self join as shown in the following query:
SELECT
e.first_name + ' ' + e.last_name employee,
m.first_name + ' ' + m.last_name manager
FROM
sales.staffs e
INNER JOIN sales.staffs m ON m.staff_id = e.manager_id
ORDER BY
manager;
Having Clause:
The HAVING clause was added to SQL because the WHERE keyword cannot be used
with aggregate functions.
GROUP BY clause summarizes the rows into groups and the HAVING clause
applies one or more conditions to these groups.
Because SQL Server processes the HAVING clause after the GROUP BY clause but before the
select list, you cannot refer to an aggregate function specified in the select list by using its column
alias. You repeat the aggregate expression in the HAVING clause instead.
The following statement uses the HAVING clause to find the customers who placed at least two
orders per year:
SELECT
customer_id,
YEAR (order_date),
COUNT (order_id) order_count
FROM
sales.orders
GROUP BY
customer_id,
YEAR (order_date)
HAVING
COUNT (order_id) >= 2
ORDER BY
customer_id;
EXISTS Operator
The EXISTS operator is a logical operator that allows you to check whether
a subquery returns any row. The EXISTS operator returns TRUE if
the subquery returns one or more rows.
The following example finds all customers who have placed more than two orders:
SELECT
customer_id,
first_name,
last_name
FROM
sales.customers c
WHERE
EXISTS (
SELECT
COUNT (*)
FROM
sales.orders o
WHERE
customer_id = c.customer_id
GROUP BY
customer_id
HAVING
COUNT (*) > 2
)
ORDER BY
first_name,
last_name;
The following query uses the ALL operator to find the products whose list prices are greater than
the average list price of every brand:
SELECT
product_name,
list_price
FROM
production.products
WHERE
list_price > ALL (
SELECT
AVG (list_price) avg_list_price
FROM
production.products
GROUP BY
brand_id
)
ORDER BY
list_price;
Update Statement:
The UPDATE statement is used to modify the existing records in a table.
UPDATE table_name
SET c1 = v1, c2 = v2, ... cn = vn
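[WHERE condition];
The following statement has no WHERE clause, so it sets updated_at to the current system date
and time for every row in the sales.taxes table:
UPDATE sales.taxes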
SET updated_at = GETDATE();
Delete:
Use the DELETE statement to remove one or more rows from a table. It is possible to delete all rows
in a table without deleting the table. This means that the table structure, attributes, and indexes
will be intact:
DELETE FROM table_name;
The full syntax lets you limit the rows to delete with TOP and a WHERE clause:
DELETE [ TOP ( expression ) [ PERCENT ] ]
FROM table_name
[WHERE search_condition];
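Two hedged sketches of this syntax follow. They assume the sales.addresses table (with a state column) that appears in the INSERT TOP example in these notes; the filter value is arbitrary.

```sql
-- Sketch: remove every address in one state.
DELETE FROM sales.addresses
WHERE state = 'CA';

-- Sketch: remove at most 5 rows. DELETE TOP has no ordering,
-- so which 5 rows are removed is not defined.
DELETE TOP (5)
FROM sales.addresses;
```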
INSERT
To add one or more rows into a table, you use the INSERT statement. The following
illustrates the most basic form of the INSERT statement:
INSERT INTO table_name (column_list)
VALUES (value_list);
To add multiple rows to a table at once, you use the following form of the INSERT statement:
INSERT INTO table_name (column_list)
VALUES
(value_list_1),
(value_list_2),
...
(value_list_n);
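A minimal sketch of the multi-row form, assuming the sales.addresses table and column list that appear in the INSERT TOP example in these notes. The address values are made-up sample data, and any other required columns of the real table would also need values.

```sql
-- Sketch: insert two rows with a single statement.
INSERT INTO sales.addresses (street, city, state, zip_code)
VALUES
    ('12 Main St', 'Santa Cruz', 'CA', '95060'),
    ('98 Oak Ave', 'Austin', 'TX', '73301');
```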
To insert the top 10 customers sorted by their first names and last names, you use
the INSERT TOP INTO SELECT statement as follows:
INSERT TOP (10)
INTO sales.addresses (street, city, state, zip_code)
SELECT
street,
city,
state,
zip_code
FROM
sales.customers
ORDER BY
first_name,
last_name;
Grouping Sets:
A grouping set is a group of columns by which you group.
Typically, a single query with an aggregate defines a single grouping set.
For example, the following query defines a grouping set that includes brand and category which
is denoted as (brand, category). The query returns the sales amount grouped by brand and
category:
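A sketch of such a query, assuming the sales.sales_summary table used in the GROUPING SETS example in these notes has brand, category, and sales columns:

```sql
-- Sketch: a single grouping set (brand, category).
SELECT
    brand,
    category,
    SUM (sales) sales
FROM
    sales.sales_summary
GROUP BY
    brand,
    category
ORDER BY
    brand,
    category;
```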
The GROUPING SETS defines multiple grouping sets in the same query. The following
shows the general syntax of the GROUPING SETS:
SELECT
column1,
column2,
aggregate_function (column3)
FROM
table_name
GROUP BY
GROUPING SETS (
(column1, column2),
(column1),
(column2),
()
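);
The following query uses GROUPING SETS to return the sales amounts by brand and category,
by brand, by category, and overall in one result set:
SELECT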
brand,
category,
SUM (sales) sales
FROM
sales.sales_summary
GROUP BY
GROUPING SETS (
(brand, category),
(brand),
(category),
()
)
ORDER BY
brand,
category;
CUBE:
The CUBE is a subclause of the GROUP BY clause
that allows you to generate multiple grouping sets.
If you have N dimension columns specified in the CUBE, you will have 2^N grouping
sets.
The following illustrates the general syntax of the CUBE:
SELECT
d1,
d2,
d3,
aggregate_function (c4)
FROM
table_name
GROUP BY
CUBE (d1, d2, d3);
In this syntax, the CUBE generates all possible grouping sets based on the dimension columns
d1, d2, and d3 that you specify in the CUBE clause.
The following example uses CUBE to generate the four grouping sets for brand and category:
SELECT
brand,
category,
SUM (sales) sales
FROM
sales.sales_summary
GROUP BY
CUBE(brand, category);
ROLLUP
The SQL Server ROLLUP is a subclause of the GROUP BY clause which provides a
shorthand for defining multiple grouping sets.
Unlike the CUBE subclause, ROLLUP does not create all possible grouping sets based
on the dimension columns; ROLLUP makes a subset of those.
When generating the grouping sets, ROLLUP assumes a hierarchy among the dimension
columns and only generates grouping sets based on this hierarchy.
The ROLLUP is often used to generate subtotals and totals for reporting purposes.
Let’s consider an example. The following CUBE (d1,d2,d3) defines eight possible grouping sets:
(d1, d2, d3)
(d1, d2)
(d2, d3)
(d1, d3)
(d1)
(d2)
(d3)
()
And the ROLLUP(d1,d2,d3) creates only four grouping sets, assuming the hierarchy d1 > d2 >
d3, as follows:
(d1, d2, d3)
(d1, d2)
(d1)
()
The ROLLUP is commonly used to calculate the aggregates of hierarchical data such as sales by
year > quarter > month.
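The following query uses ROLLUP to calculate the sales amount by category and brand, with a
subtotal for each category and a grand total:
SELECT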
category,
brand,
SUM (sales) sales
FROM
sales.sales_summary
GROUP BY
ROLLUP (category, brand);
Subquery
A subquery is a query nested inside another statement such
as SELECT, INSERT, UPDATE, or DELETE.
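A subquery is also known as an inner query or inner select, while the statement containing
the subquery is called an outer select or outer query.
A subquery can be nested within another subquery. SQL Server supports up to 32 levels
of nesting.
The following query nests one subquery inside another to find the products whose list price is
greater than the average list price of the products of the Strider and Trek brands:
SELECT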
product_name,
list_price
FROM
production.products
WHERE
list_price > (
SELECT
AVG (list_price)
FROM
production.products
WHERE
brand_id IN (
SELECT
brand_id
FROM
production.brands
WHERE
brand_name = 'Strider'
OR brand_name = 'Trek'
)
)
ORDER BY
list_price;
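In place of an expression
A subquery that returns a single value can be used wherever an expression is allowed, for example
in the select list. The following query returns each order with the highest list price among its items: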
SELECT
order_id,
order_date,
(
SELECT
MAX (list_price)
FROM
sales.order_items i
WHERE
i.order_id = o.order_id
) AS max_list_price
FROM
sales.orders o
ORDER BY
order_date DESC;
Correlated subquery
A correlated subquery is a subquery that uses the values of the outer query.
In other words, it depends on the outer query for its values. Because of this dependency, a
correlated subquery cannot be executed independently as a simple subquery.
Moreover, a correlated subquery is executed repeatedly, once for each row evaluated by
the outer query. The correlated subquery is also known as a repeating subquery.
The following example finds the products whose list price is equal to the highest list price
of the products within the same category:
SELECT
product_name,
list_price,
category_id
FROM
production.products p1
WHERE
list_price IN (
SELECT
MAX (p2.list_price)
FROM
production.products p2
WHERE
p2.category_id = p1.category_id
GROUP BY
p2.category_id
)
ORDER BY
category_id,
product_name;
EXISTS
The EXISTS operator is a logical operator that allows you to check whether
a subquery returns any row.
The EXISTS operator returns TRUE if the subquery returns one or more rows.
The following shows the syntax of the SQL Server EXISTS operator:
EXISTS ( subquery)
In this syntax, the subquery is a SELECT statement only. As soon as the subquery returns
rows, the EXISTS operator returns TRUE and stops processing immediately.
Note that even if the subquery returns a NULL value, the EXISTS operator still
evaluates to TRUE.
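The NULL case can be seen in a small sketch, assuming the sales.customers table used elsewhere in these notes. The subquery returns one row whose only value is NULL, so EXISTS is TRUE for every customer and all customers are returned.

```sql
-- Sketch: EXISTS checks for rows, not for values.
SELECT
    customer_id,
    first_name,
    last_name
FROM
    sales.customers
WHERE
    EXISTS (SELECT NULL)  -- one row containing NULL still counts as a row
ORDER BY
    first_name,
    last_name;
```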
The following example finds all customers who have placed more than two orders:
SELECT
customer_id,
first_name,
last_name
FROM
sales.customers c
WHERE
EXISTS (
SELECT
COUNT (*)
FROM
sales.orders o
WHERE
customer_id = c.customer_id
GROUP BY
customer_id
HAVING
COUNT (*) > 2
)
ORDER BY
first_name,
last_name;
UNION
SQL Server UNION is one of the set operations that allow you to combine the results of
two SELECT statements into a single result set
that includes all the rows that belong to the SELECT statements in the union.
The following illustrates the syntax of the SQL Server UNION:
query_1
UNION
query_2
The following are requirements for the queries in the syntax above:
The number and the order of the columns must be the same in both queries.
The data types of the corresponding columns must be the same or compatible.
The following Venn diagram illustrates how the result set of the T1 table unions with the result
set of the T2 table:
UNION vs. UNION ALL
By default, the UNION operator removes all duplicate rows from the result sets.
If you want to retain the duplicate rows, you need to specify the ALL keyword
explicitly as shown below:
query_1
UNION ALL
query_2
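As a sketch of the difference, the following two queries combine the names of staff and customers. They assume that both sales.staffs and sales.customers have first_name and last_name columns, as the other queries in these notes suggest.

```sql
-- Sketch: UNION removes duplicate (first_name, last_name) pairs.
SELECT first_name, last_name FROM sales.staffs
UNION
SELECT first_name, last_name FROM sales.customers
ORDER BY first_name, last_name;

-- Sketch: UNION ALL keeps every row, including duplicates.
SELECT first_name, last_name FROM sales.staffs
UNION ALL
SELECT first_name, last_name FROM sales.customers
ORDER BY first_name, last_name;
```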
UNION vs. JOIN
The join such as INNER JOIN or LEFT JOIN combines columns from two tables while
the UNION combines rows from two queries.
Join appends the result sets horizontally while union appends result set vertically.
The following picture illustrates the main difference between UNION and JOIN:
INTERSECT
The SQL Server INTERSECT combines result sets of two or more queries and returns
distinct rows that are output by both queries.
The following illustrates the syntax of the SQL Server INTERSECT:
query_1
INTERSECT
query_2
Similar to the UNION operator, the queries in the syntax above must conform to the
following rules:
Both queries must have the same number and order of columns.
The data type of the corresponding columns must be the same or compatible.
SELECT
city
FROM
sales.customers
INTERSECT
SELECT
city
FROM
sales.stores
ORDER BY
city;
The first query finds all cities of the customers and the second query finds the cities of the stores.
The whole query, which uses INTERSECT, returns the common cities of customers and stores,
which are the cities output by both input queries.
EXCEPT
The SQL Server EXCEPT compares the result sets of two queries and returns
the distinct rows from the first query that are not output by the second query. In
other words, the EXCEPT subtracts the result set of a query from another.
The following shows the syntax of the SQL Server EXCEPT:
query_1
EXCEPT
query_2
The following are the rules for combining the result sets of two queries in the above syntax:
The number and order of columns must be the same in both queries.
The data types of the corresponding columns must be the same or compatible.
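A sketch that mirrors the INTERSECT example, assuming the same sales.customers and sales.stores tables with a city column. It returns the cities that have customers but no store.

```sql
-- Sketch: distinct customer cities minus the cities that have a store.
SELECT
    city
FROM
    sales.customers
EXCEPT
SELECT
    city
FROM
    sales.stores
ORDER BY
    city;
```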
The following picture shows the EXCEPT operation of the two result sets T1 and T2:
In the syntax of a common table expression (CTE):
First, specify the expression name (expression_name) to which you can refer later in a
query.
Next, specify a list of comma-separated columns after the expression_name.
The number of columns must be the same as the number of columns defined in
the CTE_definition.
Then, use the AS keyword after the expression name or column list if the column list is
specified.
After that, define a SELECT statement whose result set populates the common table
expression.
Finally, refer to the common table expression in a query (SQL_statement) such
as SELECT, INSERT, UPDATE, DELETE, or MERGE.
We prefer to use common table expressions rather than to use subqueries because
common table expressions are more readable. We also use CTEs in queries that
contain analytic functions (or window functions).
SQL Server CTE examples
Let’s take some examples of using common table expressions.
A) Simple SQL Server CTE example
This query uses a CTE to return the sales amounts by sales staffs in 2018:
SELECT
staff,
sales
FROM
cte_sales_amounts
WHERE
year = 2018;
In this example:
First, we defined cte_sales_amounts as the name of the common table expression. The
CTE returns a result that consists of three columns, staff, year, and sales, derived from
the definition query.
Second, we constructed a query that returns the total sales amount by sales staff and year
by querying data from the orders, order_items and staffs tables.
Third, we referred to the CTE in the outer query and selected only the rows whose year is
2018.
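The three steps above can be put together as the following sketch. The column names quantity, list_price, discount, order_date, order_id and last_name, and the way the sales amount is computed, are assumptions about the sample schema rather than details given in the text.

```sql
-- Step 1 and 2: name the CTE and define the query that populates it.
WITH cte_sales_amounts (staff, sales, year) AS (
    SELECT
        s.first_name + ' ' + s.last_name,                   -- staff
        SUM(i.quantity * i.list_price * (1 - i.discount)),  -- sales (assumed formula)
        YEAR(o.order_date)                                  -- year
    FROM sales.orders o
    INNER JOIN sales.order_items i ON i.order_id = o.order_id
    INNER JOIN sales.staffs s ON s.staff_id = o.staff_id
    GROUP BY
        s.first_name + ' ' + s.last_name,
        YEAR(o.order_date)
)
-- Step 3: refer to the CTE in the outer query.
SELECT staff, sales
FROM cte_sales_amounts
WHERE year = 2018;
```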
Recursive CTE
A recursive common table expression (CTE) is a CTE that references itself.
By doing so, the CTE repeatedly executes, returns subsets of data, until it returns the
complete result set.
A recursive CTE is useful for querying hierarchical data, such as organization charts
where one employee reports to a manager, or a multi-level bill of materials where a
product consists of many components and each component itself consists of many
other components.
In the sales.staffs table, a staff member reports to zero or one manager, and a manager
may have zero or more staff members. The top manager has no manager. The relationship
is specified in the values of the manager_id column. If a staff member does not report to
anyone (as is the case for the top manager), the value in manager_id is NULL.
This example uses a recursive CTE to get all subordinates of the top manager who does
not have a manager (or the value in the manager_id column is NULL):
WITH cte_org AS (
SELECT
staff_id,
first_name,
manager_id
FROM
sales.staffs
WHERE manager_id IS NULL
UNION ALL
SELECT
e.staff_id,
e.first_name,
e.manager_id
FROM
sales.staffs e
INNER JOIN cte_org o
ON o.staff_id = e.manager_id
)
SELECT * FROM cte_org;
In this example, the anchor member gets the top manager, and the recursive member returns
the subordinates of the top manager, then the subordinates of those subordinates, and so on.
MERGE
The MERGE statement was introduced in SQL Server 2008.
It allows us to perform inserts, updates and deletes in one statement.
This means we no longer have to use multiple statements to perform an insert, an update
and a delete.
First, you specify the target table and the source table in the MERGE clause.
Second, the merge_condition determines how the rows from the source table are matched
to the rows from the target table. It is similar to the join condition in the join clause.
Typically, you use the key columns either primary key or unique key for matching.
Third, the merge_condition results in three states: MATCHED, NOT MATCHED,
and NOT MATCHED BY SOURCE.
MATCHED: these are the rows that match the merge condition. In the diagram, they are
shown as blue. For the matching rows, you need to update the columns of the rows in the
target table with values from the source table.
NOT MATCHED: these are the rows from the source table that do not have any
matching rows in the target table. In the diagram, they are shown as orange. In this case,
you need to add the rows from the source table to the target table. Note that NOT
MATCHED is also known as NOT MATCHED BY TARGET.
NOT MATCHED BY SOURCE: these are the rows in the target table that do not
match any rows in the source table. They are shown as green in the diagram. If you want
to synchronize the target table with the data from the source table, then you will need to
use this match condition to delete rows from the target table.
SQL Server MERGE statement example
Suppose we have two tables, sales.category and sales.category_staging, that store the sales by
product category:
CREATE TABLE sales.category (
category_id INT PRIMARY KEY,
category_name VARCHAR(255) NOT NULL,
amount DECIMAL(10 , 2 )
);
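A minimal sketch of a MERGE that synchronizes the two tables, covering all three match states, might look like this. It assumes that sales.category_staging has the same three columns as sales.category, which the text does not show.

```sql
MERGE sales.category AS t                 -- target table
USING sales.category_staging AS s         -- source table (assumed same columns)
    ON s.category_id = t.category_id      -- merge condition on the key column
WHEN MATCHED THEN
    -- row exists in both tables: refresh the target from the source
    UPDATE SET
        t.category_name = s.category_name,
        t.amount = s.amount
WHEN NOT MATCHED BY TARGET THEN
    -- row exists only in the source: add it to the target
    INSERT (category_id, category_name, amount)
    VALUES (s.category_id, s.category_name, s.amount)
WHEN NOT MATCHED BY SOURCE THEN
    -- row exists only in the target: remove it
    DELETE;
```

A MERGE statement must be terminated with a semicolon.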
If we do not want to remove rows from the target table, we can omit the delete condition
(the NOT MATCHED BY SOURCE branch).
PIVOT operator
SQL Server PIVOT operator rotates a table-valued expression.
It turns the unique values in one column into multiple columns in the output and
performs aggregations on any remaining column values.
The following query finds the number of products for each product category:
SELECT
category_name,
COUNT(product_id) product_count
FROM
production.products p
INNER JOIN production.categories c
ON c.category_id = p.category_id
GROUP BY
category_name;
Our goal is to turn the category names from the first column of the output into multiple columns
and count the number of products for each category name as the following picture:
In addition, we can add the model year to group the category by model year as shown in the
following output:
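You follow these steps to make a query a pivot table. First, select a base dataset for
pivoting. Second, create a temporary result set by using a derived table: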
SELECT * FROM (
SELECT
category_name,
product_id
FROM
production.products p
INNER JOIN production.categories c
ON c.category_id = p.category_id
)t
Third, apply the PIVOT operator:
SELECT * FROM
(
SELECT
category_name,
product_id
FROM
production.products p
INNER JOIN production.categories c
ON c.category_id = p.category_id
)t
PIVOT(
COUNT(product_id)
FOR category_name IN (
[Children Bicycles],
[Comfort Bicycles],
[Cruisers Bicycles],
[Cyclocross Bicycles],
[Electric Bikes],
[Mountain Bikes],
[Road Bikes])
) AS pivot_table;
This query generates the following output:
Now, any additional column which you add to the select list of the query that returns the base
data will automatically form row groups in the pivot table. For example, you can add the model
year column to the above query:
SELECT * FROM
(
SELECT
category_name,
product_id,
model_year
FROM
production.products p
INNER JOIN production.categories c
ON c.category_id = p.category_id
)t
PIVOT(
COUNT(product_id)
FOR category_name IN (
[Children Bicycles],
[Comfort Bicycles],
[Cruisers Bicycles],
[Cyclocross Bicycles],
[Electric Bikes],
[Mountain Bikes],
[Road Bikes])
) AS pivot_table;
Here is the output:
DATA DEFINITION
DROP DATABASE
To remove an existing database from a SQL Server instance, you use the DROP
DATABASE statement.
The DROP DATABASE statement allows you to delete one or more databases with the
following syntax:
DROP DATABASE [ IF EXISTS ]
database_name
[,database_name2,...];
Before deleting a database, you must ensure the following important points:
First, the DROP DATABASE statement deletes the database and also the physical disk
files used by the database. Therefore, you should have a backup of the database in case
you want to restore it in the future.
Second, you cannot drop the database that is currently being used.
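As a small illustration of both points, the sketch below switches away from the database before dropping it. TestDb is a hypothetical database name, and the IF EXISTS option is available from SQL Server 2016 onwards.

```sql
-- Leave the database first; a database that is in use cannot be dropped.
USE master;
GO

-- TestDb is a hypothetical name. Take a backup beforehand if the data matters.
DROP DATABASE IF EXISTS TestDb;
```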
CREATE SCHEMA
The SQL Server CREATE SCHEMA statement creates a new schema in the current database.
A schema is a collection of database objects including tables, views, triggers, stored
procedures, indexes, etc. A schema is associated with a username which is known as the
schema owner, who is the owner of the logically related database objects.
A schema always belongs to one database. Two tables in two schemas can share the same
name, so you may have hr.employees and sales.employees.
Built-in schemas in SQL Server
SQL Server provides us with some pre-defined schemas which have the same names as
the built-in database users and roles, for example: dbo, guest, sys,
and INFORMATION_SCHEMA.
Note that SQL Server reserves the sys and INFORMATION_SCHEMA schemas for
system objects, therefore, you cannot create or drop any objects in these schemas.
The default schema for a newly created database is dbo, which is owned by the dbo user
account. By default, when you create a new user with the CREATE USER command, the
user will take dbo as its default schema.
The following illustrates the simplified version of the CREATE SCHEMA statement:
CREATE SCHEMA schema_name
[AUTHORIZATION owner_name]
In this syntax,
First, specify the name of the schema that you want to create in the CREATE
SCHEMA clause.
Second, specify the owner of the schema after the AUTHORIZATION keyword.
SQL Server CREATE SCHEMA statement example
The following example shows how to use the CREATE SCHEMA statement to create
the customer_services schema:
CREATE SCHEMA customer_services;
GO
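ALTER SCHEMA
The ALTER SCHEMA statement transfers a securable from one schema to another within the
same database. It has the form ALTER SCHEMA target_schema_name TRANSFER [entity_type ::] object_name;
In this syntax: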
target_schema_name is the name of a schema in the current database, into which you
want to move the object. Note that it cannot be SYS or INFORMATION_SCHEMA.
The entity_type can be Object, Type or XML Schema Collection. It defaults to Object.
The entity_type represents the class of the entity for which the owner is being changed.
object_name is the name of the securable that you want to move into
the target_schema_name.
If you move a stored procedure, function, view, or trigger, SQL Server will not change
the schema name that appears in the stored definition of these securables. Therefore, it is
recommended that you drop and re-create these objects in the new schema instead of
using the ALTER SCHEMA statement to move them.
If you move an object e.g., table or synonym, SQL Server will not update the references
for these objects automatically. You must manually modify the references to reflect the
new schema name. For example, if you move a table that is referenced in a stored
procedure, you must modify the stored procedure to reflect the new schema name.
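For illustration, the following sketch moves a table into the customer_services schema. The dbo.offices table is hypothetical and only stands in for any existing table.

```sql
-- Move the (hypothetical) dbo.offices table into customer_services.
-- OBJECT is the default entity type, so "OBJECT::" could be omitted.
ALTER SCHEMA customer_services
TRANSFER OBJECT::dbo.offices;

-- From now on the table is addressed as customer_services.offices;
-- stored procedures that still mention dbo.offices must be edited by hand.
```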
DROP SCHEMA
The DROP SCHEMA statement removes a schema from a database:
DROP SCHEMA [IF EXISTS] schema_name;
In this syntax:
First, specify the name of the schema that you want to drop. If the schema contains any
objects, the statement will fail. Therefore, you must delete all objects in the schema
before removing the schema.
Second, use the IF EXISTS option to conditionally remove the schema only if the schema
exists. Attempting to drop a nonexisting schema without the IF EXISTS option will result
in an error.
CREATE TABLE statement
Tables are used to store data in the database.
Tables are uniquely named within a database and schema.
Each table contains one or more columns, and each column has an associated data type
that defines the kind of data it can store, e.g., numbers, strings, or temporal data.
To create a new table, you use the CREATE TABLE statement as follows:
CREATE TABLE [database_name.][schema_name.]table_name (
pk_column data_type PRIMARY KEY,
column_1 data_type NOT NULL,
column_2 data_type,
...,
table_constraints
);
The following statement creates a new table named sales.visits to track the customer in-store
visits:
CREATE TABLE sales.visits (
visit_id INT PRIMARY KEY IDENTITY (1, 1),
first_name VARCHAR (50) NOT NULL,
last_name VARCHAR (50) NOT NULL,
visited_at DATETIME,
phone VARCHAR(20),
store_id INT NOT NULL,
FOREIGN KEY (store_id) REFERENCES sales.stores (store_id)
);
SQL Server IDENTITY
To create an identity column for a table, you use the IDENTITY property as follows:
IDENTITY[(seed,increment)]
In this syntax:
The seed is the value of the first row loaded into the table.
The increment is the incremental value added to the identity value of the previous row.
The default value of seed and increment is 1, i.e., (1,1). It means that the first row
loaded into the table will have the value of 1, the second row will have the value
of 2, and so on.
Suppose you want the identity column of the first row to be 10 and the incremental
value to be 10. You use the following syntax:
IDENTITY (10,10)
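A short sketch of how the seed and increment play out in practice is shown below. The dbo.identity_demo table and its item_name column are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table with an identity column that starts at 10 and steps by 10.
CREATE TABLE dbo.identity_demo (
    id INT IDENTITY (10, 10) PRIMARY KEY,
    item_name VARCHAR (50) NOT NULL
);

-- The id column is omitted; SQL Server generates its values.
INSERT INTO dbo.identity_demo (item_name)
VALUES ('first'), ('second'), ('third');

-- The three rows receive the id values 10, 20 and 30.
SELECT id, item_name
FROM dbo.identity_demo
ORDER BY id;
```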
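ALTER TABLE
ADD column examples
The following statement adds a new column named description to the sales.quotations table:
ALTER TABLE sales.quotations
ADD description VARCHAR (255) NOT NULL;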
The following statement adds two new columns named amount and customer_name to
the sales.quotations table:
ALTER TABLE sales.quotations
ADD
amount DECIMAL (10, 2) NOT NULL,
customer_name VARCHAR (50) NOT NULL;
To change the data type of an existing column, you use the ALTER TABLE ALTER COLUMN
statement:
ALTER TABLE table_name
ALTER COLUMN column_name datatype;
The new data type must be compatible with the old one, otherwise, you will get a
conversion error in case the column has data and it fails to convert.
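The compatibility rule can be seen in the following sketch, which uses a hypothetical table dbo.t1. The first change succeeds because every stored integer converts to a string; the second fails because one stored value is not numeric.

```sql
-- dbo.t1 is a hypothetical table used only for illustration.
CREATE TABLE dbo.t1 (c INT);
INSERT INTO dbo.t1 VALUES (1), (2), (3);

-- INT values convert to VARCHAR, so this statement succeeds.
ALTER TABLE dbo.t1 ALTER COLUMN c VARCHAR (2);

-- Store a value that is not a number ...
INSERT INTO dbo.t1 VALUES ('@');

-- ... and the change back to INT now fails with a conversion error.
ALTER TABLE dbo.t1 ALTER COLUMN c INT;
```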
DROP COLUMN
Sometimes, you need to remove one or more unused or obsolete columns from a table. To do
this, you use the ALTER TABLE DROP COLUMN statement as follows (note that some
database systems do not allow deleting a column):
ALTER TABLE table_name
DROP COLUMN column_name;
First, specify the name of the table from which you want to delete the column.
Second, specify the name of the column that you want to delete.
If the column that you want to delete has a CHECK constraint, you must delete the
constraint first before removing the column. Also, SQL Server does not allow you to
delete a column that has a PRIMARY KEY or a FOREIGN KEY constraint.
If you want to delete multiple columns at once, you use the following syntax:
In this syntax, you specify columns that you want to drop as a list of comma-separated columns
in the DROP COLUMN clause.
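The multi-column form described here can be sketched as follows; table_name, column_name_1 and column_name_2 are placeholders.

```sql
-- Drop several columns in one statement by listing them after DROP COLUMN.
ALTER TABLE table_name
DROP COLUMN column_name_1, column_name_2;
```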
SQL Server ALTER TABLE DROP COLUMN examples
The price column has a CHECK constraint, therefore, you cannot delete it. If you try to execute
the following statement, you will get an error:
ALTER TABLE sales.price_lists
DROP COLUMN price;
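To get past the error, the constraint has to be dropped before the column. The sketch below assumes the CHECK constraint is named ck_positive_price; the real name is not given in the text and can be looked up, for example, in the sys.check_constraints catalog view.

```sql
-- Step 1: drop the CHECK constraint (ck_positive_price is an assumed name).
ALTER TABLE sales.price_lists
DROP CONSTRAINT ck_positive_price;

-- Step 2: the column can now be dropped.
ALTER TABLE sales.price_lists
DROP COLUMN price;
```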
Rename table
SQL Server does not have any statement that directly renames a table. However, it does
provide you with a stored procedure named sp_rename that allows you to change the
name of a table.
The following shows the syntax of using the sp_rename stored procedure for changing
the name of a table:
EXEC sp_rename 'old_table_name', 'new_table_name'
Note that both the old and new names of the table must be enclosed in single quotation
marks.
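As an example, the call below renames a hypothetical table sales.contr to contracts. The old name may be schema-qualified, while the new name is given without a schema prefix.

```sql
-- sales.contr is a hypothetical table; after the call it is sales.contracts.
EXEC sp_rename 'sales.contr', 'contracts';
```

The table stays in its original schema; only the name changes.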
Drop Table
The SQL Server DROP TABLE statement removes one or more tables from a database.
Sometimes, you want to remove a table that is no longer in use. To do this, you use the
DROP TABLE statement, whose general form is
DROP TABLE [IF EXISTS] [database_name.][schema_name.]table_name;
When SQL Server drops a table, it also deletes all data, triggers, constraints, and permissions
of that table. However, SQL Server does not drop the views and stored
procedures that reference the dropped table. To drop these dependent
objects, you must use the DROP VIEW and DROP PROCEDURE statements explicitly.
TRUNCATE TABLE
Sometimes, you want to delete all rows from a table. In this case, you typically use
the DELETE statement without a WHERE clause.
The following example creates a new table named customer_groups:
CREATE TABLE sales.customer_groups (
group_id INT PRIMARY KEY IDENTITY,
group_name VARCHAR (50) NOT NULL
);
In the TRUNCATE TABLE syntax, TRUNCATE TABLE [database_name.][schema_name.]table_name,
you first specify the name of the table from which you want to delete all
rows. Second, the database name is the name of the database in which the table was
created. The database name is optional. If you skip it, the statement will delete the rows
of the table in the currently connected database.
The following statements first insert some rows into the customer_groups table and then
delete all rows from it using the TRUNCATE TABLE statement:
INSERT INTO sales.customer_groups (group_name)
VALUES
('Intercompany'),
('Third Party'),
('One time');
TRUNCATE TABLE sales.customer_groups;
Temporary Table
Temporary tables are tables that exist temporarily on the SQL Server.
Temporary tables are useful for storing intermediate result sets that are accessed
multiple times.
The first way to create a temporary table is to use the SELECT INTO statement as shown below:
SELECT
select_list
INTO
temporary_table
FROM
table_name
The name of the temporary table starts with a hash symbol (#). For example, the following
statement creates a temporary table using the SELECT INTO statement:
SELECT
product_name,
list_price
INTO #trek_products --- temporary table
FROM
production.products
WHERE
brand_id = 9;
In this example, we created a temporary table named #trek_products with two columns derived
from the select list of the SELECT statement. The statement created the temporary table and
populated data from the production.products table into the temporary table.
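Since the text introduces SELECT INTO as the first way, here is a sketch of the other common approach: creating the temporary table explicitly with CREATE TABLE and then filling it with INSERT INTO ... SELECT. The table name #haro_products, the column sizes and the brand_id value are illustrative assumptions.

```sql
-- Create the temporary table explicitly; the # prefix makes it temporary.
CREATE TABLE #haro_products (
    product_name VARCHAR (255),   -- assumed size
    list_price DECIMAL (10, 2)    -- assumed precision
);

-- Populate it from a regular table (brand_id = 2 is an arbitrary example).
INSERT INTO #haro_products
SELECT product_name, list_price
FROM production.products
WHERE brand_id = 2;

-- Query it like any other table.
SELECT * FROM #haro_products;
```

This form gives control over column types and constraints, whereas SELECT INTO derives them from the select list.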
Once you execute the statement, you can find the temporary table name created in the system
database named tempdb, which can be accessed via the SQL Server Management Studio using
the following path System Databases > tempdb > Temporary Tables as shown in the
following picture:
Manual Deletion
From the connection in which the temporary table was created, you can manually remove the
temporary table by using the DROP TABLE statement (the name keeps its # prefix for a
local temporary table, or its ## prefix for a global one):
DROP TABLE ##table_name;
Synonym
In SQL Server, a synonym is an alias or alternative name for a database object such as a
table, view, stored procedure, user-defined function, and sequence. A synonym provides
you with many benefits if you use it properly.
SQL Server CREATE SYNONYM statement syntax
To create a synonym, you use the CREATE SYNONYM statement as follows:
CREATE SYNONYM [schema_name.]synonym_name FOR object_name;
The following example uses the DROP SYNONYM statement to drop the orders
synonym:
DROP SYNONYM IF EXISTS orders;
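For context, a synonym named orders could have been created and used as in the sketch below, before being dropped. It assumes the sales.orders table from the sample schema.

```sql
-- Create an alias for sales.orders in the default schema.
CREATE SYNONYM orders FOR sales.orders;

-- The synonym can be queried like the underlying table.
SELECT * FROM orders;
```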
SELECT INTO
The SELECT INTO statement creates a new table and inserts rows from a query into it.
A common use is to copy all rows and columns from an existing table into a new table,
for example to make a backup copy of an existing table. Note that SELECT INTO does not
copy constraints or indexes from the source table.
The following SELECT INTO statement creates the destination table and copies the rows
that satisfy the WHERE condition from the source table to the destination table:
SELECT
select_list
INTO
destination
FROM
source
[WHERE condition]
A) Using SQL Server SELECT INTO to copy table within the same database example
First, create a new schema for storing the new table.
CREATE SCHEMA marketing;
GO
Second, create the marketing.customers table like the sales.customers table and copy all rows
from the sales.customers table to the marketing.customers table:
SELECT
*
INTO
marketing.customers
FROM
sales.customers;
Third, query data from the marketing.customers table to verify the copy:
SELECT
*
FROM
marketing.customers;
PRIMARY KEY constraint
A primary key is a column or a group of columns that uniquely identifies each row in a
table. You create a primary key for a table by using the PRIMARY KEY constraint.
If the primary key consists of only one column, you can define the PRIMARY
KEY constraint as a column constraint:
CREATE TABLE table_name (
pk_column data_type PRIMARY KEY,
...
);
In case the primary key has two or more columns, you must use the PRIMARY KEY constraint
as a table constraint:
CREATE TABLE table_name (
pk_column_1 data_type,
pk_column_2 data_type,
...
PRIMARY KEY (pk_column_1, pk_column_2)
);
Each table can contain only one primary key. All columns that participate in the primary
key must be defined as NOT NULL. SQL Server automatically sets the NOT
NULL constraint for all the primary key columns if the NOT NULL constraint is not
specified for these columns.
SQL Server also automatically creates a unique clustered index (or a non-clustered index
if specified as such) when you create a primary key.
If a table such as sales.events was created without a primary key, you can add one
afterwards. To make the event_id column the primary key, you use the following ALTER
TABLE statement:
ALTER TABLE sales.events
ADD PRIMARY KEY(event_id);
Referential actions
The foreign key constraint ensures referential integrity. It means that you can only insert a row
into the child table if there is a corresponding row in the parent table.
Besides, the foreign key constraint allows you to define the referential actions when the row in
the parent table is updated or deleted as follows:
FOREIGN KEY (foreign_key_columns)
REFERENCES parent_table(parent_key_columns)
ON UPDATE action
ON DELETE action;
The ON UPDATE and ON DELETE clauses specify which action will execute when a row in the
parent table is updated or deleted. The following are the permitted actions:
NO ACTION,
CASCADE,
SET NULL, and
SET DEFAULT
Delete or update actions of rows in the parent table
If you delete or update one or more rows in the parent table, you can set one of the
following actions:
NO ACTION: SQL Server raises an error and rolls back the delete or update action on
the row in the parent table.
CASCADE: SQL Server deletes or updates the rows in the child table that
correspond to the row deleted or updated in the parent table.
SET NULL: SQL Server sets the foreign key columns of the rows in the child table to NULL
if the corresponding rows in the parent table are deleted or updated. To execute this
action, the foreign key columns must be nullable.
SET DEFAULT: SQL Server sets the foreign key columns of the rows in the child table to
their default values if the corresponding rows in the parent table are deleted or updated.
To execute this action, the foreign key columns must have default definitions. Note that
a nullable column has a default value of NULL if no default value is specified.
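To make the CASCADE action concrete, here is a small sketch with two hypothetical tables, dbo.parent_demo and dbo.child_demo. Only ON DELETE is specified, so updates fall back to the default NO ACTION.

```sql
-- Hypothetical parent and child tables.
CREATE TABLE dbo.parent_demo (
    parent_id INT PRIMARY KEY
);

CREATE TABLE dbo.child_demo (
    child_id INT PRIMARY KEY,
    parent_id INT NULL,
    FOREIGN KEY (parent_id)
        REFERENCES dbo.parent_demo (parent_id)
        ON DELETE CASCADE          -- deleting a parent deletes its children
);

INSERT INTO dbo.parent_demo VALUES (1), (2);
INSERT INTO dbo.child_demo VALUES (10, 1), (11, 1), (12, 2);

-- Deleting parent 1 also removes child rows 10 and 11.
DELETE FROM dbo.parent_demo WHERE parent_id = 1;

-- Only child row 12 (parent 2) is left.
SELECT child_id, parent_id FROM dbo.child_demo;
```

With NO ACTION instead of CASCADE, the DELETE would fail while child rows 10 and 11 still exist.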
CHECK (Age>=18)
);
SQL Server UNIQUE constraint
SQL Server UNIQUE constraints allow you to ensure that the data stored in a column, or
a group of columns, is unique among the rows in a table.
The following statement creates a table whose data in the email column is unique among
the rows in the hr.persons table:
CREATE SCHEMA hr;
GO
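The column-constraint form that the text refers to would plausibly look like the sketch below. It is reconstructed from the table-constraint version of hr.persons shown afterwards, so the column list is an assumption; run only one of the two CREATE TABLE variants, since both create the same table.

```sql
-- UNIQUE declared directly on the email column (column constraint).
CREATE TABLE hr.persons (
    person_id INT IDENTITY PRIMARY KEY,
    first_name VARCHAR (255) NOT NULL,
    last_name VARCHAR (255) NOT NULL,
    email VARCHAR (255) UNIQUE
);
```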
In this syntax, you define the UNIQUE constraint as a column constraint. You can also define
the UNIQUE constraint as a table constraint, like this:
CREATE TABLE hr.persons(
person_id INT IDENTITY PRIMARY KEY,
first_name VARCHAR(255) NOT NULL,
last_name VARCHAR(255) NOT NULL,
email VARCHAR(255),
UNIQUE(email)
);
Behind the scenes, SQL Server automatically creates a UNIQUE index to enforce the uniqueness
of data stored in the columns that participate in the UNIQUE constraint. Therefore, if you
attempt to insert a duplicate row, SQL Server rejects the change and returns an error message
stating that the UNIQUE constraint has been violated.
NOT NULL constraint
The SQL Server NOT NULL constraint simply specifies that a column must not contain
NULL.
The following example creates a table with NOT NULL constraints for the
columns: first_name, last_name, and email:
CREATE SCHEMA hr;
GO
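A sketch of the table this example describes could look as follows. The data types and the nullable phone column are assumptions added for contrast; only the NOT NULL constraints on first_name, last_name and email come from the text.

```sql
CREATE TABLE hr.persons (
    person_id INT IDENTITY PRIMARY KEY,
    first_name VARCHAR (255) NOT NULL,
    last_name VARCHAR (255) NOT NULL,
    email VARCHAR (255) NOT NULL,
    phone VARCHAR (20)              -- assumed column; NULL is allowed here
);
```

An INSERT that omits email, or supplies NULL for it, is rejected, while phone may be left empty.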
SQL Server Window Functions calculate an aggregate value based on a group of rows and return
multiple rows for each group.
Name            Description
CUME_DIST       Calculate the cumulative distribution of a value in a set of values.
DENSE_RANK      Assign a rank value to each row within a partition of a result set, with no gaps in rank values.
FIRST_VALUE     Get the value of the first row in an ordered partition of a result set.
LAG             Provide access to a row at a given physical offset that comes before the current row.
LAST_VALUE      Get the value of the last row in an ordered partition of a result set.
LEAD            Provide access to a row at a given physical offset that follows the current row.
NTILE           Distribute rows of an ordered partition into a number of groups or buckets.
PERCENT_RANK    Calculate the percent rank of a value in a set of values.
RANK            Assign a rank value to each row within a partition of a result set.
ROW_NUMBER      Assign a unique sequential integer to rows within a partition of a result set, starting from 1 for the first row.
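As a brief illustration of a few of these functions, the sketch below ranks products by price within each category. It uses the production.products table and the product_name, category_id and list_price columns that appear earlier in these notes; the output aliases are arbitrary.

```sql
SELECT
    product_name,
    category_id,
    list_price,
    -- unique sequence number per category, most expensive product first
    ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY list_price DESC) AS row_num,
    -- equal prices share a rank, and the following rank is skipped
    RANK() OVER (PARTITION BY category_id ORDER BY list_price DESC) AS price_rank,
    -- price of the preceding row in the same category (NULL for the first row)
    LAG(list_price) OVER (PARTITION BY category_id ORDER BY list_price DESC) AS previous_price
FROM production.products;
```

Unlike GROUP BY, the query keeps one output row per product; the window functions only add columns computed over each partition.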