Lesson 09: Window Functions in SQL
Learning Objectives
A window function is an SQL function that takes its input values from a window of one or more
rows of a SELECT statement’s result set.
Window functions perform operations on a group of rows and return a value for every row, so
each row keeps its own identity instead of being collapsed into a single group row.
General Syntax
Syntax
window_function_name(expression) OVER (
    [partition_definition]
    [order_definition]
    [frame_definition]
)
Clauses
Partition Clause
Order By Clause
Frame Clause
Partition Clause
The partition clause divides the rows into partitions, and the window function is
evaluated separately within each partition. A partition boundary separates two
adjacent partitions.
Syntax
PARTITION BY
<expression>[{,<expression>...}]
Order By Clause
Syntax
ORDER BY <expression> [ASC|DESC]
[{,<expression> [ASC|DESC]...}]
Frame Clause
A frame is a subset of the current partition. The frame clause defines that subset relative
to the current row, so the frame moves within the partition as the current row changes.
Syntax
frame_unit
{<frame_start>|<frame_between>}
The frame unit can be ROWS or RANGE. It specifies the kind of relationship between
the current row and the frame rows.
Frame Clause
Keyword   Meaning
ROWS      Offsets are differences in row numbers between the current row and the frame rows.
RANGE     Offsets are differences in row values between the current row and the frame rows.
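The difference between the two frame units can be sketched with a small runnable example. It uses Python's built-in sqlite3 module (SQLite 3.28 or later) as a stand-in for MySQL; the sales table and its values are hypothetical and chosen so that the day column has a gap.

```python
import sqlite3

# Hypothetical table: day 3 is missing on purpose to expose ROWS vs RANGE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 10), (2, 20), (4, 40), (5, 50)])

query = """
SELECT day, amount,
       -- ROWS: the frame is the previous physical row plus the current row
       SUM(amount) OVER (ORDER BY day
                         ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS rows_sum,
       -- RANGE: the frame is every row whose day lies in [day - 1, day]
       SUM(amount) OVER (ORDER BY day
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS range_sum
FROM sales
ORDER BY day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For day 4, the ROWS frame still reaches back to day 2 (the previous row), while the RANGE frame finds no row with day 3 and therefore contains only day 4.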
Problem Scenario:
Objective:
You are required to retrieve the employee ID, first name, role, department, and employee rating,
and to calculate the maximum employee rating in each department by applying the MAX function to
the employee rating field with PARTITION BY on the department field.
Instructions:
Refer to the employee dataset given in the course resource section in the LMS and create an
employee table using the fields mentioned in the dataset. Insert the values accordingly to
perform the above objective.
Use Case for Window Functions
EMP_RATING: Rating for the employee (1: Not achieved any goals, 2: Below expectation,
3: Meeting expectation, 4: Excellent performance, 5: Overachiever)
Solution:
By executing this query, HR can identify the maximum employee rating in each
department.
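A query matching the objective above might look like the following sketch. It runs through Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for MySQL; the column names follow the EMP_ID, FIRST_NAME, ROLE, DEPT, and EMP_RATING fields used in this lesson, and the four sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, ROLE TEXT, DEPT TEXT, EMP_RATING INTEGER)""")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", [
    (1, "Asha", "Analyst", "FINANCE", 3),
    (2, "Ben", "Manager", "FINANCE", 5),
    (3, "Chen", "Engineer", "IT", 4),
    (4, "Dina", "Engineer", "IT", 2),
])

query = """
SELECT EMP_ID, FIRST_NAME, ROLE, DEPT, EMP_RATING,
       MAX(EMP_RATING) OVER (PARTITION BY DEPT) AS MAX_EMP_RATING
FROM employee
ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every employee row is kept, and each one carries the highest rating found in its own department.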
Output:
Aggregate Window Functions
Syntax
window_function ( [ ALL ] expression )
OVER (
    [ PARTITION BY expr_list ]
    [ ORDER BY order_list frame_clause ]
)
Arguments in Aggregate Window Functions
Keyword   Meaning
ALL       Retains all duplicate values from the expression.
Aggregate window functions: AVG(), MIN(), COUNT(), MAX(), SUM()
Use Case for MIN and MAX
Problem Scenario:
The HR of a company wants to identify the minimum and the maximum salary of the employees in
each role.
Objective:
You are required to display the employee’s ID, first name, role, and salary, and to find the
minimum and maximum salary in each role by applying the MIN and MAX functions to the salary
field with a PARTITION BY clause on the role field.
Instructions:
Refer to the employee table which is created and perform the above objectives.
Solution:
/* SELECT EMP_ID, FIRST_NAME, ROLE, SALARY and calculate the minimum and maximum salary of the
employees using the MIN and MAX functions with the PARTITION BY clause on the ROLE field. */
By executing this query, the HR can identify the maximum and the
minimum salary for a given role.
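The comment above describes a query along these lines. This is a sketch rather than the course's own solution: it runs on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, ROLE TEXT, SALARY INTEGER)""")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Asha", "Analyst", 5000),
    (2, "Ben", "Analyst", 7000),
    (3, "Chen", "Engineer", 8000),
    (4, "Dina", "Engineer", 6000),
])

query = """
SELECT EMP_ID, FIRST_NAME, ROLE, SALARY,
       MIN(SALARY) OVER (PARTITION BY ROLE) AS MIN_SALARY,
       MAX(SALARY) OVER (PARTITION BY ROLE) AS MAX_SALARY
FROM employee
ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```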
Output:
Use Case for AVG and COUNT
Problem Scenario:
The HR of a company wants to identify the average employee performance department-wise
and also find the total number of records in each department.
Objective:
You are required to display the employee’s ID, first name, department, and employee rating by
calculating the average employee rating and the total number of records in a department using
PARTITION BY clause, AVG, and COUNT functions on department and employee rating fields
respectively.
Instructions:
Refer to the employee table which is created and perform the above objectives.
Solution:
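One way to meet the AVG and COUNT objective is sketched below. It is an illustration under assumptions: Python's built-in sqlite3 module (SQLite 3.25 or later) stands in for MySQL, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, DEPT TEXT, EMP_RATING INTEGER)""")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Asha", "FINANCE", 3),
    (2, "Ben", "FINANCE", 5),
    (3, "Chen", "IT", 4),
    (4, "Dina", "IT", 2),
    (5, "Eli", "IT", 3),
])

query = """
SELECT EMP_ID, FIRST_NAME, DEPT, EMP_RATING,
       AVG(EMP_RATING) OVER (PARTITION BY DEPT) AS AVG_EMP_RATING,
       COUNT(*)        OVER (PARTITION BY DEPT) AS DEPT_RECORDS
FROM employee
ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```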
Output:
Use Case for SUM
Problem Scenario:
Objective:
You are required to display the employee’s ID, first name, department, and employee rating, and
to calculate the total employee rating in each department by applying the SUM function to the
employee rating field with a PARTITION BY clause on the department field.
Instructions:
Refer to the employee table which is created and perform the above objectives.
Solution:
/* SELECT EMP_ID, FIRST_NAME, DEPT, EMP_RATING and calculate the total employee rating in
a department using the SUM function with the PARTITION BY clause on the DEPT field. */
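A query of the kind the comment describes could be written as follows. This sketch uses Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL, with invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, DEPT TEXT, EMP_RATING INTEGER)""")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Asha", "FINANCE", 3),
    (2, "Ben", "FINANCE", 5),
    (3, "Chen", "IT", 4),
    (4, "Dina", "IT", 2),
    (5, "Eli", "IT", 3),
])

query = """
SELECT EMP_ID, FIRST_NAME, DEPT, EMP_RATING,
       SUM(EMP_RATING) OVER (PARTITION BY DEPT) AS TOTAL_EMP_RATING
FROM employee
ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```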
Output:
Assisted Practice: Aggregate Window Functions
Duration: 20 min
Problem Statement: You are required to calculate the total, average, maximum, and minimum salary
of the employee by grouping the departments from the employee table.
Steps to be performed:
Step 1: Creating the employee table and inserting values in it:
CREATE
CREATE TABLE lep_7.employee (
    emp_id     INT          NOT NULL,
    f_name     VARCHAR(45)  NULL,
    l_name     VARCHAR(45)  NOT NULL,
    job_id     VARCHAR(45)  NOT NULL,
    salary     DECIMAL(8,2) NOT NULL,
    manager_id INT          NOT NULL,
    dept_id    VARCHAR(45)  NOT NULL,
    PRIMARY KEY (emp_id)
);
INSERT
INSERT INTO lep_7.employee
    (emp_id, f_name, l_name, job_id, salary, manager_id, dept_id)
VALUES
    (103, 'krishna', 'gee', '125', 500000, 5, '44');
QUERY
Output:
Ranking Window Functions
Ranking Window Functions and Its Types
Ranking window functions assign a rank to each row within its partition, based on
the ordering given in the ORDER BY clause.
Dense Rank
Definition
• It assigns a rank to every row in a partition based on the ORDER BY clause.
• It assigns the same rank for equal values and leaves no gaps in the ranking sequence.
Syntax
DENSE_RANK() OVER (
    PARTITION BY <expression>[{,<expression>...}]
    ORDER BY <expression> [ASC|DESC] [{,<expression>...}]
)
Rank
Definition
• RANK assigns a rank to every row within its partition.
• The first row in each partition has rank 1. Rows with equal values share a rank, and the
  ranks that follow skip ahead by the number of tied rows.
Syntax
RANK() OVER (
    PARTITION BY <expr1>[{,<expr2>...}]
    ORDER BY <expr1> [ASC|DESC] [{,<expr2>...}]
)
Problem Scenario:
The HR of a company wants to assign a rank for each employee based on their employee rating.
Objective:
You are required to display the employee’s ID, first name, department, and employee rating, and
to assign a rank to all the employees based on their employee rating using the ORDER BY clause
with the RANK and DENSE_RANK functions on the employee rating field.
Instructions:
Refer to the employee table which is created and perform the above objectives.
Use Case for Rank and Dense Rank
Solution:
/* SELECT EMP_ID, FIRST_NAME, DEPT, EMP_RATING and assign a rank to all the employees
based on their employee rating using RANK and DENSE_RANK. */
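The difference between the two functions shows up as soon as ratings tie. The following sketch runs on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL; the sample rows are invented, and ranking from the highest rating downward is an assumption, since the objective does not state a direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, DEPT TEXT, EMP_RATING INTEGER)""")
# Invented sample rows; employees 2 and 3 share a rating on purpose.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Asha", "FINANCE", 5),
    (2, "Ben", "FINANCE", 4),
    (3, "Chen", "IT", 4),
    (4, "Dina", "IT", 2),
])

query = """
SELECT EMP_ID, FIRST_NAME, DEPT, EMP_RATING,
       RANK()       OVER (ORDER BY EMP_RATING DESC) AS EMP_RANK,
       DENSE_RANK() OVER (ORDER BY EMP_RATING DESC) AS EMP_DENSE_RANK
FROM employee
ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

After the tie at rank 2, RANK jumps to 4 for the last employee, while DENSE_RANK continues with 3.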
Output:
Row Number
Definition Syntax
Problem Scenario:
The IT department of a company wants to assign an asset number for each employee based on
their employee ID in ascending order.
Objective:
You are required to display the employee’s ID, first name, role, and department, and to assign a
number to each employee in ascending order of their employee ID using the ORDER BY clause and
the ROW_NUMBER function on the employee ID field.
Instructions:
Refer to the employee table which is created and perform the above objective.
Use Case for Row Number
Solution:
/* SELECT EMP_ID, FIRST_NAME, ROLE, DEPT and assign an asset number to all the employees in
ascending order of their employee ID. */
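The asset numbering described in the comment can be sketched as below. Python's built-in sqlite3 module (SQLite 3.25 or later) stands in for MySQL; the ASSET_NUMBER alias and the sample rows are illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, ROLE TEXT, DEPT TEXT)""")
# Invented sample rows, inserted out of order on purpose.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (30, "Chen", "Engineer", "IT"),
    (10, "Asha", "Analyst", "FINANCE"),
    (20, "Ben", "Manager", "FINANCE"),
])

query = """
SELECT EMP_ID, FIRST_NAME, ROLE, DEPT,
       ROW_NUMBER() OVER (ORDER BY EMP_ID) AS ASSET_NUMBER
FROM employee
ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```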
Output:
Percent Rank
Definition Syntax
Problem Scenario:
The HR of a company wants to calculate the overall percentile of the employee rating in a
department.
Objective:
You are required to display the employee’s ID, first name, role, department, and employee rating,
and to calculate the percentile of the employee rating using the ORDER BY clause and the
PERCENT_RANK function on the employee rating field.
Instructions:
Refer to the employee table which is created and perform the above objective.
Use Case for Percent Rank
Solution:
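PERCENT_RANK returns (rank - 1) / (rows - 1), so its values run from 0 to 1. The sketch below illustrates this on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL. The sample rows are invented, the ROLE and DEPT columns are left out for brevity, and the window is ordered over the whole table, which is one reading of the objective.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    EMP_ID INTEGER, FIRST_NAME TEXT, EMP_RATING INTEGER)""")
# Invented sample rows; two employees share rating 2.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 2),
    (2, "Ben", 5),
    (3, "Chen", 1),
    (4, "Dina", 4),
    (5, "Eli", 2),
])

query = """
SELECT EMP_ID, FIRST_NAME, EMP_RATING,
       PERCENT_RANK() OVER (ORDER BY EMP_RATING) AS RATING_PERCENTILE
FROM employee
ORDER BY EMP_RATING, EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding PARTITION BY DEPT inside OVER would compute the percentile separately within each department.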
Output:
Miscellaneous Window Functions
Types of Miscellaneous Window Functions
First Value Function
Definition Syntax
Problem Scenario:
The HR department of an organization aims to find the employee ID of the employee with the
highest experience by sorting their experience in descending order.
Objective:
You are required to display the employee ID, first name, and experience, and to identify the
employee ID of the first employee when the rows are sorted by experience in descending order,
using the ORDER BY clause on the experience field and the FIRST_VALUE function on the employee ID field.
Instructions:
Refer to the employee table which is created and perform the above objective.
Use Case for First Value Function
Solution:
/* SELECT EMP_ID, FIRST_NAME, EXP and determine the EMP_ID with the highest experience based
on descending order of the experience. */
By executing this query, the HR can identify the employee ID with the
highest experience.
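A query of this shape is sketched below, run on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL. The MOST_EXPERIENCED alias and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EMP_ID INTEGER, FIRST_NAME TEXT, EXP INTEGER)")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 4),
    (2, "Ben", 12),
    (3, "Chen", 7),
])

query = """
SELECT EMP_ID, FIRST_NAME, EXP,
       FIRST_VALUE(EMP_ID) OVER (ORDER BY EXP DESC) AS MOST_EXPERIENCED
FROM employee
ORDER BY EXP DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every row reports employee 2, the one with 12 years of experience.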
Output:
Last Value Function
Definition Syntax
Problem Scenario:
The HR of a company wants to determine the last employee ID by sorting the experience in
ascending order.
Objective:
You are required to display the employee’s ID, first name, and experience, and to determine the
last employee ID when the rows are sorted by experience in ascending order, using the ORDER BY
clause on the experience field and the LAST_VALUE function on the employee ID field.
Instructions:
Refer to the employee table which is created and perform the above objective.
Use Case for Last Value Function
Solution:
/* SELECT EMP_ID,FIRST_NAME,EXP and determine the last value in the EMP_ID based on
ascending order of the experience. */
By executing this query, the HR can identify the last value of the
employee ID based on their experience.
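LAST_VALUE is sensitive to the frame: when ORDER BY is present and no frame is given, the default frame ends at the current row, so the function simply returns the current row's own value. The sketch below contrasts that with an explicit full-partition frame. It runs on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL, and the aliases and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EMP_ID INTEGER, FIRST_NAME TEXT, EXP INTEGER)")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 4),
    (2, "Ben", 12),
    (3, "Chen", 7),
])

query = """
SELECT EMP_ID, FIRST_NAME, EXP,
       -- default frame: stops at the current row
       LAST_VALUE(EMP_ID) OVER (ORDER BY EXP) AS LAST_DEFAULT_FRAME,
       -- explicit frame: covers the whole partition
       LAST_VALUE(EMP_ID) OVER (
           ORDER BY EXP
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS LAST_FULL_FRAME
FROM employee
ORDER BY EXP
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the second column consistently reports employee 2, the last row in ascending order of experience.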
Output:
NTH Value Function
Definition
The NTH_VALUE function returns a value from the Nth row of an ordered group of rows.
Syntax
NTH_VALUE(expression, N)
FROM FIRST
OVER (
    partition_clause
    order_clause
    frame_clause
)
Use Case for NTH Value Function
Problem Scenario:
The HR of a company wants to identify the third-highest experience among employees in the company.
Objective:
You are required to display the employee’s ID, first name, and experience, and to find the third-highest
experience among employees using the ORDER BY clause in descending order of the experience field and
the NTH_VALUE function.
Instructions:
Refer to the employee table which is created and perform the above objective.
Solution:
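One possible query for the third-highest experience is sketched below, using Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL. The THIRD_HIGHEST_EXP alias and the sample rows are invented. The frame is widened to the whole partition so that every row sees the third row, and the optional FROM FIRST keyword from the syntax above is omitted because it is the default and SQLite does not accept it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EMP_ID INTEGER, FIRST_NAME TEXT, EXP INTEGER)")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 4),
    (2, "Ben", 12),
    (3, "Chen", 7),
    (4, "Dina", 9),
])

query = """
SELECT EMP_ID, FIRST_NAME, EXP,
       NTH_VALUE(EXP, 3) OVER (
           ORDER BY EXP DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS THIRD_HIGHEST_EXP
FROM employee
ORDER BY EXP DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Without the explicit frame, the first two rows would show NULL, because their default frames contain fewer than three rows.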
Output:
NTILE Function
Definition
The NTILE function divides the rows of a sorted partition into a specified number of groups.
Syntax
NTILE(n) OVER (
    PARTITION BY <expression>[{,<expression>...}]
    ORDER BY <expression> [ASC|DESC] [{,<expression>...}]
)
Problem Scenario:
The HR of a company wants to sort the employee table in ascending order of experience
and divide it into four groups.
Objective:
You are required to display all the details, sorted by experience in ascending order and divided
into four groups, using the ORDER BY clause and the NTILE function on the experience field.
Instructions:
Refer to the employee table which is created and perform the above objective.
Use Case for NTILE Function
Solution:
/* SELECT all the details in the employee table by sorting in ascending order of the
experience into four partitions using ORDER BY on EXP and NTILE function. */
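The grouping can be sketched as follows on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL. The EXP_GROUP alias and the six sample rows are invented. Six rows do not divide evenly by four, so the example also shows how NTILE distributes the remainder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EMP_ID INTEGER, FIRST_NAME TEXT, EXP INTEGER)")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 5), (2, "Ben", 10), (3, "Chen", 1),
    (4, "Dina", 8), (5, "Eli", 2), (6, "Fay", 3),
])

query = """
SELECT EMP_ID, FIRST_NAME, EXP,
       NTILE(4) OVER (ORDER BY EXP) AS EXP_GROUP
FROM employee
ORDER BY EXP
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first two groups receive two rows each and the last two receive one row each, because the extra rows go to the earliest groups.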
Output:
Cume Dist Function
Definition
The CUME_DIST function calculates the cumulative distribution of a value within a
group of values.
Syntax
CUME_DIST() OVER (
    [ partition_by_clause ]
    order_by_clause
)
Use Case for Cume Dist Function
Problem Scenario:
The HR of a company wants to sort the employee data based on their experience in ascending
order and calculate the cumulative distribution on the employee table.
Objective:
You are required to display the employee’s ID, first name, and experience, together with a row
number and the cumulative distribution of the experience, using the ORDER BY clause with the
ROW_NUMBER and CUME_DIST functions on the experience field.
Instructions:
Refer to the employee table which is created and perform the above objective.
Solution:
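CUME_DIST returns the fraction of rows whose value is less than or equal to the current row's value. The sketch below pairs it with ROW_NUMBER, as the objective asks. It runs on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL; the aliases and sample rows are invented, and EMP_ID is added to the ROW_NUMBER ordering only to make the numbering of tied rows deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EMP_ID INTEGER, FIRST_NAME TEXT, EXP INTEGER)")
# Invented sample rows; employees 1 and 4 share the same experience.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 4),
    (2, "Ben", 8),
    (3, "Chen", 2),
    (4, "Dina", 4),
])

query = """
SELECT EMP_ID, FIRST_NAME, EXP,
       ROW_NUMBER() OVER (ORDER BY EXP, EMP_ID) AS ROW_NUM,
       CUME_DIST()  OVER (ORDER BY EXP)         AS CUME_DIST_EXP
FROM employee
ORDER BY EXP, EMP_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The two tied rows get different row numbers but the same cumulative distribution, 0.75, since three of the four rows have an experience of 4 or less.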
Output:
Lead Function
Definition Syntax
Problem Scenario:
The HR of a company wants to ignore the two lowest and highest experiences of the employees.
Objective:
You are required to display the employee’s ID, first name, experience and sort the employees in
ascending order of their experience. Ignore the two lowest experiences using LEAD and two
highest experiences using LAG to determine the median of the employee experience.
Instructions:
Refer to the employee table which is created and perform the above objective.
Use Case for Lead and Lag Function
Solution:
/* SELECT EMP_ID, FIRST_NAME, EXP and ignore the two lowest and two highest experiences using
the LEAD and LAG functions. */
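One reading of this comment is sketched below: LEAD(EXP, 2) looks two rows ahead, so its column never contains the two lowest experiences, and LAG(EXP, 2) looks two rows back, so its column never contains the two highest. The sketch runs on Python's built-in sqlite3 module (SQLite 3.25 or later) in place of MySQL; the aliases and sample rows are invented, and the course's own query may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EMP_ID INTEGER, FIRST_NAME TEXT, EXP INTEGER)")
# Invented sample rows, not the course dataset.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", 5), (2, "Ben", 11), (3, "Chen", 1),
    (4, "Dina", 7), (5, "Eli", 3), (6, "Fay", 9),
])

query = """
SELECT EMP_ID, FIRST_NAME, EXP,
       LEAD(EXP, 2) OVER (ORDER BY EXP) AS EXP_TWO_AHEAD,  -- skips the 2 lowest
       LAG(EXP, 2)  OVER (ORDER BY EXP) AS EXP_TWO_BEHIND  -- skips the 2 highest
FROM employee
ORDER BY EXP
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Where no row exists at the requested offset, both functions return NULL, which appears as None in Python.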
Output:
Assisted Practice: Ranking and Miscellaneous
Window Functions
Duration: 15 min
Problem Statement: You are required to identify the rank and row number and calculate the
cumulative distribution and percentile score based on the student score from the marksheet table.
Steps to be performed:
Step 1: Creating the marksheet table and inserting values in it:
CREATE
CREATE TABLE marksheet (
    score INT NOT NULL, year INT NULL, class VARCHAR(45) NULL,
    ranking VARCHAR(45) NULL, s_id INT NOT NULL
);
INSERT
QUERY
Output:
Knowledge Check
Knowledge Check 1
What is the result of window functions?
B. Group of values
C. Sorted values
Window functions perform various operations on a group of rows and provide an aggregated value for
each row.
Knowledge Check 2
Which of the following are the clauses of window functions in MySQL?
The clauses are the PARTITION BY clause, the ORDER BY clause, and the frame clause.
Knowledge Check 3
Which of the following window functions is used to calculate the cumulative
distribution of a column's values?
A. ROW_NUMBER()
B. DENSE_RANK()
C. NTILE()
D. CUME_DIST()
The CUME_DIST() function is used to calculate the cumulative distribution of a column's values within
a partition.
Knowledge Check 4
Which ranking window function returns a value from zero to one?
A. NTH value
B. Percent rank
C. NTILE
D. Row number
Problem statement:
You are working for a gadget-selling company. Your manager has asked you
to perform an end-to-end analysis of all types of products sold by your
organization, i.e., from creating a table and inserting data to extracting the
useful data points using SQL.
Objective:
To analyze the different products across the available categories based on their
prices.
Lesson-End Project: Multi-Brand Gadget-Selling Company
Tasks to be performed:
4. Fetch the price of the most and least expensive product under the
Headphone category
5. Rank all the products based on the price using the RANK() as well as the
DENSE_RANK() functions and return the records where RANK() or
DENSE_RANK() has a value equal to 5 or 6
Note: You can also give alias to the newly generated rank and dense rank
columns.