ETL Interview+Prep

The document discusses common table expressions (CTEs) in SQL Server. It provides information on what a CTE is, how it can be used to simplify complex queries by breaking them into smaller pieces, and the benefits of using CTEs such as improved readability and reusability. The document also provides syntax examples for creating basic and multiple CTEs, and demonstrates how CTEs can be used to limit counts and calculate averages.

Uploaded by

vishal lalwani
Copyright
© © All Rights Reserved
CTE in SQL Server

CTE stands for common table expression. A CTE is a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, DELETE, or MERGE statement.

A CTE is similar to a view, except that it is not stored as an object in the database: it is defined within the execution scope of a single statement and is available only for the duration of that statement.

CTEs can be used to simplify complex queries by breaking them down into smaller, more manageable pieces. For example, let's say you have a query that retrieves data from three different tables.

With a CTE, you can wrap the query against each table in its own CTE and then reference those CTEs in your main query. This can make your code more readable and easier to maintain. In SQL Server, a CTE is always defined with the WITH clause, placed immediately before the statement that uses it. CREATE TABLE does not create a CTE; it creates a table, which is a separate, stored object.

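CTEs can be used to:

• Simplify complex queries.
• Name an intermediate result once and reference it several times in the same statement. Note that SQL Server does not store the result of a CTE; the definition is expanded into the statement much like a view, so a CTE is not by itself a performance gain.
• Create recursive queries (queries that reference themselves).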
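
The recursive case in the list above can be sketched as follows. This is a minimal illustration rather than anything from the original notes: the CTE name Numbers and the column n are made up for the example, and the query needs no tables.

```sql
-- Recursive CTE: the anchor member returns 1, and the recursive member
-- adds 1 to the previous row until the WHERE condition stops the recursion.
WITH Numbers (n) AS (
    SELECT 1              -- anchor member
    UNION ALL
    SELECT n + 1          -- recursive member references the CTE itself
    FROM Numbers
    WHERE n < 10
)
SELECT n
FROM Numbers;             -- returns the integers 1 through 10
```

SQL Server stops a recursive CTE after 100 recursion levels by default, so a longer sequence would need the MAXRECURSION query hint.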

Syntax for Common Table Expressions


WITH expression_name[(column_name [,...])]
AS
(Define_CTE_query)
SQL_CTE_statement;

Simple CTE example


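WITH CTE_ContractsTotals AS (
    -- PARTITION BY gives one total per customer;
    -- ORDER BY inside OVER would produce a running total across customers instead
    SELECT DISTINCT CUSTOMER_ID,
        SUM(AMOUNT) OVER (PARTITION BY CUSTOMER_ID) AS TotalAmount
    FROM contracts
)
SELECT * FROM CTE_ContractsTotals;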

CTE example to limit counts and report averages


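WITH Contracts_CTE (CtrCustomerID, NumberOfContracts)
AS
(
    SELECT CUSTOMER_ID, COUNT(*)
    FROM contracts
    GROUP BY CUSTOMER_ID
)
-- Cast first: AVG over an integer column returns a truncated integer in SQL Server
SELECT AVG(CAST(NumberOfContracts AS DECIMAL(10, 2))) AS "Average"
FROM Contracts_CTE;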

Multiple CTE definitions in a single query


WITH RegisterCTE (course_id, course_name, price, student_id)
AS (
    SELECT c.id, c.name, c.price, rc.student_id
    FROM courses c
    JOIN register_course rc ON c.id = rc.course_id
),
StudentCTE (id, name)
AS
(
    SELECT s.id, s.name
    FROM students s
)
SELECT x.student_id, y.name, x.course_name, x.price
FROM RegisterCTE x
JOIN StudentCTE y ON x.student_id = y.id
ORDER BY x.student_id;

The Benefits of Using a CTE


There are many benefits to using a CTE in SQL Server. Perhaps the most obvious benefit is that it can make your code more readable and easier to follow. When you use a CTE, you are essentially creating a temporary named result set that you can then reference later in your query. This means that you can break up your query into smaller, more manageable pieces. In some cases, using a CTE can even allow you to avoid nesting a subquery altogether.

A CTE can also be referenced more than once within the statement that defines it, which saves you from repeating the same subquery. It cannot be reused by later statements, however. If you find yourself writing the same query over and over again, a view is the better tool.
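
As a rough sketch of referencing one CTE twice in the same statement, the query below reuses the Contracts_CTE definition from the earlier example to list customers whose contract count is above the average. It assumes the same contracts table and CUSTOMER_ID column used earlier in these notes.

```sql
WITH Contracts_CTE (CtrCustomerID, NumberOfContracts) AS (
    SELECT CUSTOMER_ID, COUNT(*)
    FROM contracts
    GROUP BY CUSTOMER_ID
)
SELECT CtrCustomerID, NumberOfContracts
FROM Contracts_CTE                      -- first reference
WHERE NumberOfContracts > (
    -- second reference to the same CTE, inside a subquery
    SELECT AVG(CAST(NumberOfContracts AS DECIMAL(10, 2)))
    FROM Contracts_CTE
);
```

Without the CTE, the GROUP BY query would have to be written out twice.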

SQL Server OFFSET FETCH


This section shows how to use the SQL Server OFFSET and FETCH clauses to limit the number of rows returned by a query.

The OFFSET and FETCH clauses are options of the ORDER BY clause. They allow you to limit the number of rows returned by a query.

The following illustrates the syntax of the OFFSET and FETCH clauses:

ORDER BY column_list [ASC | DESC]
OFFSET offset_row_count {ROW | ROWS}
FETCH {FIRST | NEXT} fetch_row_count {ROW | ROWS} ONLY

In this syntax:

• The OFFSET clause specifies the number of rows to skip before starting to return rows from the query. The offset_row_count can be a constant, variable, or parameter that is greater than or equal to zero.
• The FETCH clause specifies the number of rows to return after the OFFSET clause has been processed. The fetch_row_count can be a constant, variable, or scalar that is greater than or equal to one.
• The OFFSET clause is mandatory while the FETCH clause is optional. ROW and ROWS are synonyms, so you can use them interchangeably. Similarly, you can use FIRST and NEXT interchangeably.
• Note that you must use the OFFSET and FETCH clauses with the ORDER BY clause. Otherwise, you will get an error.
• The OFFSET and FETCH clauses are preferable to the TOP clause for implementing a query paging solution.
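
A paging query built on these clauses might look like the sketch below. The variables @PageNumber and @PageSize are hypothetical names introduced for illustration; the table and columns are the ones used in the examples in this section.

```sql
DECLARE @PageNumber INT = 3;   -- hypothetical: 1-based page to display
DECLARE @PageSize   INT = 10;  -- hypothetical: rows per page

SELECT product_name, list_price
FROM production.products
ORDER BY list_price, product_name
OFFSET (@PageNumber - 1) * @PageSize ROWS   -- skip the rows of earlier pages
FETCH NEXT @PageSize ROWS ONLY;             -- return one page
```

For pages to be stable between calls, the ORDER BY columns should identify each row uniquely; if two products can share both price and name, adding a key column as a final tiebreaker would be safer.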

To skip the first 10 products and return the rest, you use the OFFSET clause as shown in
the following statement:

SELECT
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price,
    product_name
OFFSET 10 ROWS;

To skip the first 10 products and select the next 10 products, you use
both OFFSET and FETCH clauses as follows:

SELECT
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price,
    product_name
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;

To get the top 10 most expensive products you use both OFFSET and FETCH clauses:

SELECT
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price DESC,
    product_name
OFFSET 0 ROWS
FETCH FIRST 10 ROWS ONLY;

In this example, the ORDER BY clause sorts the products by their list prices in descending order. Then, the OFFSET clause skips zero rows and the FETCH clause fetches the first 10 products from the list.

SQL Server SELECT TOP

The SELECT TOP clause allows you to limit the number of rows or percentage of rows returned in a query result set.

Because the order of rows stored in a table is unspecified, the SELECT TOP statement should always be used in conjunction with the ORDER BY clause. The result set is then limited to the first N ordered rows.

The following shows the syntax of the TOP clause with the SELECT statement:

SELECT TOP (expression) [PERCENT]
    [WITH TIES]
    select_list
FROM
    table_name
ORDER BY
    column_name;

In this syntax, the SELECT statement can have other clauses such as WHERE, JOIN, HAVING, and GROUP BY.

• expression

Following the TOP keyword is an expression that specifies the number of rows to be returned. The expression is evaluated to a float value if PERCENT is used; otherwise, it is converted to a BIGINT value.

• PERCENT

The PERCENT keyword indicates that the query returns the first N percent of rows, where N is the result of the expression.

• WITH TIES

WITH TIES allows you to return additional rows whose values match the last row in the limited result set. Note that WITH TIES may cause more rows to be returned than you specify in the expression.

For example, if you want to return the most expensive product, you can use TOP 1. However, if two or more products share the highest price, TOP 1 returns only one of them and you miss the others.

To avoid this, you can use TOP 1 WITH TIES. It returns every product whose price ties with that of the first row.

SQL Server SELECT TOP examples

1) Using TOP with a constant value

The following example uses a constant value to return the top 10 most expensive
products.

SELECT TOP 10
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price DESC;

2) Using TOP to return a percentage of rows

The following example uses PERCENT to specify the number of products returned in the result set. The production.products table has 321 rows. One percent of 321 is a fractional value (3.21), so SQL Server rounds it up to the next whole number, which is four (4) in this case.

SELECT TOP 1 PERCENT
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price DESC;

3) Using TOP WITH TIES to include rows that match the values in the last row

The following statement returns the top three most expensive products:

SELECT TOP 3 WITH TIES
    product_name,
    list_price
FROM
    production.products
ORDER BY
    list_price DESC;

In this example, the third most expensive product has a list price of 6499.99. Because the statement used TOP WITH TIES, it returned three more products whose list prices are the same as the third one.

Nesting subquery

A subquery can be nested within another subquery. SQL Server supports up to 32 levels
of nesting. Consider the following example:

SELECT
    product_name,
    list_price
FROM
    production.products
WHERE
    list_price > (
        -- level 1: average price of the products of the selected brands
        SELECT
            AVG (list_price)
        FROM
            production.products
        WHERE
            brand_id IN (
                -- level 2: ids of the selected brands
                SELECT
                    brand_id
                FROM
                    production.brands
                WHERE
                    brand_name = 'Strider'
                    OR brand_name = 'Trek'
            )
    )
ORDER BY
    list_price;

SQL Server subquery types

You can use a subquery in many places:

• In place of an expression
• With IN or NOT IN
• With ANY or ALL
• With EXISTS or NOT EXISTS
• In an UPDATE, DELETE, or INSERT statement
• In the FROM clause
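
The last item in the list, a subquery in the FROM clause (often called a derived table), has no example elsewhere in these notes. A minimal sketch, assuming the sales.orders table with the order_id and customer_id columns used in the other examples, could be:

```sql
-- The inner query counts orders per customer; the outer query treats
-- that result as a table and finds the largest count.
SELECT MAX(order_count) AS max_orders_per_customer
FROM (
    SELECT customer_id, COUNT(order_id) AS order_count
    FROM sales.orders
    GROUP BY customer_id
) AS t;   -- SQL Server requires an alias for a derived table
```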

SQL Server subquery is used in place of an expression

If a subquery returns a single value, it can be used anywhere an expression is used.

In the following example, a subquery is used as a column expression named max_list_price in a SELECT statement.

SELECT
    order_id,
    order_date,
    (
        SELECT
            MAX (list_price)
        FROM
            sales.order_items i
        WHERE
            i.order_id = o.order_id
    ) AS max_list_price
FROM
    sales.orders o
ORDER BY
    order_date DESC;

SQL Server subquery is used with IN operator

A subquery that is used with the IN operator returns a set of zero or more values. After
the subquery returns values, the outer query makes use of them.

The following query finds the names of all mountain bike and road bike products that the Bike Stores sell.

SELECT
    product_id,
    product_name
FROM
    production.products
WHERE
    category_id IN (
        SELECT
            category_id
        FROM
            production.categories
        WHERE
            category_name = 'Mountain Bikes'
            OR category_name = 'Road Bikes'
    );

SQL Server subquery is used with ANY operator

A subquery introduced with the ANY operator has the following syntax:

scalar_expression comparison_operator ANY (subquery)

Assume that the subquery returns a list of values v1, v2, … vn. The ANY operator returns TRUE if at least one comparison pair (scalar_expression, vi) evaluates to TRUE; otherwise, it returns FALSE.

The following query finds the products whose list price is greater than or equal to the average list price of at least one brand:

SELECT
    product_name,
    list_price
FROM
    production.products
WHERE
    list_price >= ANY (
        SELECT
            AVG (list_price)
        FROM
            production.products
        GROUP BY
            brand_id
    );

SQL Server subquery is used with ALL operator

The ALL operator has the same syntax as the ANY operator:

scalar_expression comparison_operator ALL (subquery)

The ALL operator returns TRUE if all comparison pairs (scalar_expression, vi) evaluate to TRUE; otherwise, it returns FALSE.

The following query finds the products whose list price is greater than or equal to the average list price of every brand:

SELECT
    product_name,
    list_price
FROM
    production.products
WHERE
    list_price >= ALL (
        SELECT
            AVG (list_price)
        FROM
            production.products
        GROUP BY
            brand_id
    );
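
One way to reason about the two operators with >= is through MIN and MAX: being at least as large as any value means reaching the smallest one, and being at least as large as all values means reaching the largest one. The sketch below rewrites both queries that way. It assumes the subquery returns at least one row and no NULL averages; with an empty or NULL-containing result, ALL in particular can behave differently. The aliases avg_price and brand_avg are made up for the example.

```sql
-- Returns the same rows as list_price >= ANY (...):
-- the price reaches the smallest brand average.
SELECT product_name, list_price
FROM production.products
WHERE list_price >= (
    SELECT MIN(avg_price)
    FROM (
        SELECT AVG(list_price) AS avg_price
        FROM production.products
        GROUP BY brand_id
    ) AS brand_avg
);

-- Returns the same rows as list_price >= ALL (...):
-- the price reaches the largest brand average.
SELECT product_name, list_price
FROM production.products
WHERE list_price >= (
    SELECT MAX(avg_price)
    FROM (
        SELECT AVG(list_price) AS avg_price
        FROM production.products
        GROUP BY brand_id
    ) AS brand_avg
);
```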

SQL Server subquery is used with EXISTS or NOT EXISTS

The following illustrates the syntax of a subquery introduced with the EXISTS operator:

WHERE [NOT] EXISTS (subquery)

The EXISTS operator returns TRUE if the subquery returns any rows; otherwise, it returns FALSE.

NOT EXISTS negates the EXISTS operator.

The following query finds the customers who placed at least one order in 2017:

SELECT
    customer_id,
    first_name,
    last_name,
    city
FROM
    sales.customers c
WHERE
    EXISTS (
        SELECT
            customer_id
        FROM
            sales.orders o
        WHERE
            o.customer_id = c.customer_id
            AND YEAR (order_date) = 2017
    )
ORDER BY
    first_name,
    last_name;

If you use NOT EXISTS instead of EXISTS, you can find the customers who did not buy any products in 2017.

SELECT
    customer_id,
    first_name,
    last_name,
    city
FROM
    sales.customers c
WHERE
    NOT EXISTS (
        SELECT
            customer_id
        FROM
            sales.orders o
        WHERE
            o.customer_id = c.customer_id
            AND YEAR (order_date) = 2017
    )
ORDER BY
    first_name,
    last_name;
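
A possible variant of the query above replaces YEAR (order_date) = 2017 with a date range. Wrapping a column in a function generally prevents SQL Server from seeking on an index over that column, so a half-open range is often preferred. This sketch assumes order_date is a date or datetime column.

```sql
SELECT customer_id, first_name, last_name, city
FROM sales.customers c
WHERE NOT EXISTS (
    SELECT 1                              -- EXISTS only checks for rows, so the select list is irrelevant
    FROM sales.orders o
    WHERE o.customer_id = c.customer_id
      AND o.order_date >= '20170101'      -- first day of 2017, inclusive
      AND o.order_date <  '20180101'      -- first day of 2018, exclusive
)
ORDER BY first_name, last_name;
```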

SQL Server Indexes


Indexes are special data structures associated with tables or views that help speed up queries. SQL Server provides two types of indexes: clustered indexes and non-clustered indexes.

A clustered index stores data rows in a sorted structure based on its key values. Each table can have only one clustered index because data rows can be sorted in only one order. A table that has a clustered index is called a clustered table.

In SQL Server, defining a primary key creates a clustered index on the key column(s) by default, unless the table already has a clustered index or you specify NONCLUSTERED.

A clustered index works like a dictionary, where the entries themselves are arranged in alphabetical order.

Although a table can have only one clustered index, that index can be defined on multiple columns. Such an index is called a composite index.

SQL Server CREATE CLUSTERED INDEX syntax

The syntax for creating a clustered index is as follows:

CREATE CLUSTERED INDEX index_name
ON schema_name.table_name (column_list);

In this syntax:

• First, specify the name of the clustered index after the CREATE CLUSTERED INDEX clause.
• Second, specify the schema and table name on which you want to create the index.
• Third, list one or more columns included in the index.
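
The three steps above can be sketched against a small table. The table production.parts, its columns, and the index name ix_parts_id are hypothetical and exist only for this illustration; the production schema is assumed to exist already.

```sql
-- Hypothetical table without a primary key, so it has no clustered index yet
CREATE TABLE production.parts (
    part_id   INT NOT NULL,
    part_name VARCHAR(100)
);

-- Store the rows of the table sorted by part_id
CREATE CLUSTERED INDEX ix_parts_id
ON production.parts (part_id);
```

Listing more than one column in the parentheses would make this a composite clustered index.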

What is Non-clustered index?


A non-clustered index stores the index separately from the data rows. The index contains pointers to the location of the data. A single table can have many non-clustered indexes because each one is stored apart from the table data.

For example, a book can have more than one index: one at the beginning that lists the contents of the book unit by unit, and a second one that lists terms in alphabetical order.

A non-clustered index is defined on columns that do not determine the physical order of the table. This type of index helps you to improve the performance of queries that filter on columns other than the primary key. A non-clustered index can also be declared UNIQUE to enforce a unique key on a table.

An index is a lookup structure associated with a table or view that the database uses to improve data retrieval performance. In an index, keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently. An index is created automatically when a primary key or unique constraint is defined on the table. There are two types of index:

• Clustered Index - When a table is created with a primary key constraint, the database engine automatically creates a clustered index by default. The data rows are sorted and stored in the table or view based on the key values.
• Non-Clustered Index - When a table is created with a UNIQUE constraint, the database engine automatically creates a non-clustered index. A nonclustered index contains the nonclustered index key values, and each key value entry has a pointer to the data row that contains the key value.

Sr. No. | Key               | Clustered Index                                       | Non-Clustered Index
--------|-------------------|-------------------------------------------------------|------------------------------------------------------------
1       | Basic             | It is created on the primary key                      | It can be created on any key
2       | Ordering          | Stores data physically according to the order         | It doesn't impact the order
3       | Number of indexes | Only one clustered index can be there in a table      | There can be any number of non-clustered indexes in a table
4       | Space             | No extra space is required to store logical structure | Extra space is required to store logical structure
5       | Performance       | Data retrieval is faster than non-clustered index     | Data update is faster than clustered index

Key Difference between Clustered and Non-clustered Index


• A clustered index sorts the data rows in the table on their key values, whereas a non-clustered index stores the data at one location and the index at another location.
• A clustered index doesn't require additional disk space for a separate copy of the keys, whereas a non-clustered index requires additional disk space.
• A clustered index generally offers faster data access, whereas a non-clustered index is typically slower because it needs an extra lookup to reach the data row.

To create a non-clustered index, you use the CREATE INDEX statement:

CREATE [NONCLUSTERED] INDEX index_name
ON table_name(column_list);

In this syntax:

• First, specify the name of the index after the CREATE NONCLUSTERED INDEX clause. Note that the NONCLUSTERED keyword is optional.
• Second, specify the table name on which you want to create the index and a list of columns of that table as the index key columns.
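
As an illustration of the syntax, the sketch below indexes the city column of the sales.customers table used in the EXISTS examples. The index name ix_customers_city and the city value 'Atwater' are hypothetical.

```sql
CREATE NONCLUSTERED INDEX ix_customers_city   -- hypothetical index name
ON sales.customers (city);

-- A query that filters on city can now use the index
-- instead of scanning every row of the table.
SELECT customer_id, city
FROM sales.customers
WHERE city = 'Atwater';                       -- hypothetical value
```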

Oracle Correlated Subquery

Unlike a standalone subquery, a correlated subquery is a subquery that uses values from the outer query. In addition, a correlated subquery may be evaluated once for each row selected by the outer query. Because of this, a query that uses a correlated subquery could be slow.

A correlated subquery is also known as a repeating subquery or a synchronized subquery.

Oracle correlated subquery examples

Let’s take some examples of the correlated subqueries to better understand how they
work.

A) Oracle correlated subquery in the WHERE clause example

The following query finds all products whose list price is above average for their
category.

SELECT
product_id,
product_name,
list_price
FROM
products p
WHERE
list_price > (
SELECT
AVG( list_price )
FROM
products
WHERE
category_id = p.category_id
);
In the above query, the outer query is:

SELECT
product_id,
product_name,
list_price
FROM
products p
WHERE
list_price >

And the correlated subquery is:

SELECT
AVG( list_price )
FROM
products
WHERE
category_id = p.category_id

For each product from the products table, Oracle has to execute the correlated subquery
to calculate the average price by category.
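
Because the correlated subquery may run once per product, one alternative worth knowing is an analytic function, which computes the category average alongside each row in a single pass. The following is a sketch of that approach, not the method used in the original example; the alias avg_category_price is made up for illustration.

```sql
SELECT product_id, product_name, list_price
FROM (
    SELECT
        product_id,
        product_name,
        list_price,
        -- average list price of the product's own category, attached to every row
        AVG(list_price) OVER (PARTITION BY category_id) AS avg_category_price
    FROM products
) p
WHERE list_price > avg_category_price;
```

Whether this is actually faster depends on the data and the optimizer, so it is worth comparing execution plans before choosing one form.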

B) Oracle correlated subquery in the SELECT clause example

The following query returns all products and the average standard cost based on the
product category:

SELECT
product_id,
product_name,
standard_cost,
ROUND(
(
SELECT
AVG( standard_cost )
FROM
products
WHERE
category_id = p.category_id
),
2
) avg_standard_cost
FROM
products p
ORDER BY
product_name;

For each product from the products table, Oracle executes the correlated subquery to calculate the average standard cost for the product category.

Note that the above query used the ROUND() function to round the average standard
cost to two decimals.

C) Oracle correlated subquery with the EXISTS operator example

We usually use a correlated subquery with the EXISTS operator. For example, the following statement returns all customers who have no orders:

SELECT
customer_id,
name
FROM
customers
WHERE
NOT EXISTS (
SELECT
*
FROM
orders
WHERE
orders.customer_id = customers.customer_id
)
ORDER BY
name;

Oracle EXISTS
The Oracle EXISTS operator is a Boolean operator that returns either true or false.
The EXISTS operator is often used with a subquery to test for the existence of rows:

SELECT
*
FROM
table_name
WHERE
EXISTS(subquery);

The EXISTS operator returns true if the subquery returns any rows, otherwise, it returns
false. In addition, the EXISTS operator terminates the processing of the subquery once
the subquery returns the first row.
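
As a small sketch of this behaviour, the query below lists customers who have at least one order, using the customers and orders tables from the NOT EXISTS example above. Because EXISTS only tests whether a row exists, the subquery can select a constant.

```sql
SELECT customer_id, name
FROM customers c
WHERE EXISTS (
    SELECT 1                               -- the selected value is never used
    FROM orders o
    WHERE o.customer_id = c.customer_id    -- stops at the first matching order
)
ORDER BY name;
```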
