Data Engineering SQL Window Functions
SQL
75 Windowing
Functions
Shwetank Singh
GritSetGrow - GSGLearn.com
1. WHAT IS A WINDOW FUNCTION IN SQL?
2. HOW DO YOU DEFINE A WINDOW IN SQL?
3. WHAT ARE THE TYPES OF WINDOW FUNCTIONS SUPPORTED IN SQL SERVER?
4. WHAT IS THE DIFFERENCE BETWEEN A WINDOW FUNCTION AND AN AGGREGATE FUNCTION?
An aggregate function groups
rows and returns a single result
for each group, while a window
function returns a result for each
row within the specified window.
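The contrast can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine (SQLite 3.25 or later is assumed for window function support; this is not SQL Server), and the orders table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data, made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, customerid TEXT, val REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "A", 10.0), (2, "A", 20.0), (3, "B", 5.0)],
)

# Aggregate function with GROUP BY: one output row per group.
grouped = conn.execute(
    "SELECT customerid, SUM(val) FROM orders "
    "GROUP BY customerid ORDER BY customerid"
).fetchall()

# Window function: every input row is kept and carries its partition's total.
windowed = conn.execute(
    "SELECT orderid, customerid, val, "
    "SUM(val) OVER (PARTITION BY customerid) AS customer_total "
    "FROM orders ORDER BY orderid"
).fetchall()

print(grouped)
print(windowed)
```

The grouped query returns two rows for three orders, while the windowed query returns all three orders with the total repeated on each.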
5. HOW DO YOU USE THE ROW_NUMBER() FUNCTION?
6. EXPLAIN THE RANK() FUNCTION.
7. WHAT IS THE DIFFERENCE BETWEEN RANK() AND DENSE_RANK()?
8. DESCRIBE THE NTILE() FUNCTION.
9. WHAT DOES THE LAG() FUNCTION DO?
10. HOW DOES THE LEAD() FUNCTION DIFFER FROM LAG()?
11. EXPLAIN THE USE OF FIRST_VALUE() FUNCTION.
FIRST_VALUE(column) OVER
(PARTITION BY column ORDER BY
column) returns the first value in
an ordered set of values.
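12. WHAT IS THE LAST_VALUE() FUNCTION USED FOR?
LAST_VALUE(column) OVER
(PARTITION BY column ORDER BY
column) returns the last value in
the window frame. With the default
frame, which ends at the current
row, that is the current row's
value (or its last peer's). Add
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING to get the
last value in the whole partition.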
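A frequent surprise with LAST_VALUE() is the default frame, which stops at the current row when ORDER BY is present. The sketch below compares the default frame with an explicit full-partition frame. It runs on SQLite through Python's sqlite3 module as a stand-in engine (SQLite 3.25+ assumed), and the orders table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, val INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

last_values = conn.execute("""
    SELECT orderid,
           -- Default frame: ends at the current row.
           LAST_VALUE(val) OVER (ORDER BY orderid) AS default_frame,
           -- Explicit frame: spans the whole partition.
           LAST_VALUE(val) OVER (
               ORDER BY orderid
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_frame
    FROM orders
    ORDER BY orderid
""").fetchall()

print(last_values)
```

With the default frame each row simply sees its own value; only the explicit frame returns 30 on every row.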
13. HOW DO YOU CALCULATE A MOVING AVERAGE USING WINDOW FUNCTIONS?
Use AVG(column) OVER
(PARTITION BY column ORDER BY
column ROWS BETWEEN n
PRECEDING AND CURRENT ROW) to
calculate a moving average over
the current row and the n rows
before it (at most n + 1 rows).
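As a rough illustration of the frame at work, the sketch below computes a three-row moving average (n = 2). SQLite via Python's sqlite3 module stands in for the database engine (SQLite 3.25+ assumed), and the orders table and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

moving = conn.execute("""
    SELECT orderid, val,
           AVG(val) OVER (
               ORDER BY orderid
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM orders
    ORDER BY orderid
""").fetchall()

print(moving)
```

The first two rows average fewer than three values because the frame is cut off at the start of the partition.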
14. DESCRIBE THE PERCENT_RANK() FUNCTION.
15. WHAT IS THE CUME_DIST() FUNCTION?
16. EXPLAIN PERCENTILE_CONT() FUNCTION.
PERCENTILE_CONT(p) WITHIN
GROUP (ORDER BY column) OVER
(PARTITION BY column) computes
the percentile p (a fraction from
0 to 1) assuming a continuous
distribution. It interpolates
between adjacent values, so the
result need not be a value that
appears in the column.
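17. HOW DOES PERCENTILE_DISC() DIFFER FROM PERCENTILE_CONT()?
PERCENTILE_DISC(p) WITHIN
GROUP (ORDER BY column) OVER
(PARTITION BY column) computes
a percentile based on a discrete
distribution. It returns an actual
value from the column (the first
one whose cumulative distribution
is at least p), whereas
PERCENTILE_CONT() interpolates.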
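The difference between the two percentile functions can be seen in a small pure-Python sketch. The helper names percentile_cont and percentile_disc are hypothetical stand-ins that mimic the SQL definitions on a plain list; NULL handling and partitioning are left out.

```python
import math


def percentile_cont(values, p):
    """Interpolate linearly between neighbouring values (continuous model)."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)  # zero-based fractional row position
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_disc(values, p):
    """Return the first actual value whose cumulative distribution is >= p."""
    ordered = sorted(values)
    n = len(ordered)
    for i, v in enumerate(ordered, start=1):
        if i / n >= p:
            return v
    return ordered[-1]


vals = [10, 20, 30, 40]
cont = percentile_cont(vals, 0.5)
disc = percentile_disc(vals, 0.5)
print(cont, disc)
```

For an even number of values the continuous median (25.0) falls between two rows, while the discrete median (20) is always one of the stored values.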
18. WHAT IS A FRAME IN THE CONTEXT OF WINDOW FUNCTIONS?
19. HOW DO YOU USE THE ROWS BETWEEN CLAUSE?
20. WHAT IS THE RANGE CLAUSE USED FOR IN WINDOW FUNCTIONS?
The RANGE clause defines a
frame of rows based on their
ORDER BY values relative to the
current row, not their physical
positions, so rows that tie on the
ordering value (peers) enter the
frame together.
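The effect of value-based framing shows up when the ordering column has ties. The sketch below puts ROWS and RANGE side by side on the same ordering. SQLite via Python's sqlite3 module is used as a stand-in engine (SQLite 3.25+ assumed), and the sales table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (saleid INTEGER, day INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 2, 20), (3, 2, 30), (4, 3, 40)],  # two sales share day 2
)

result = conn.execute("""
    SELECT saleid, day, val,
           SUM(val) OVER (ORDER BY day
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
           SUM(val) OVER (ORDER BY day
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
    FROM sales
    ORDER BY saleid
""").fetchall()

for row in result:
    print(row)
```

Both day-2 rows get the same range_sum (60) because they are peers. Their rows_sum values differ, and which of the two tied rows is counted first is not guaranteed.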
21. HOW CAN YOU EMULATE IGNORE NULLS IN SQL SERVER?
22. EXPLAIN THE CONCEPT OF WINDOWING IN SQL.
23. WHAT ARE THE BENEFITS OF USING WINDOW FUNCTIONS?
24. CAN WINDOW FUNCTIONS BE USED IN THE WHERE CLAUSE?
25. HOW DO WINDOW FUNCTIONS IMPROVE PERFORMANCE?
26. WHAT IS THE DIFFERENCE BETWEEN PARTITION BY AND ORDER BY IN WINDOW FUNCTIONS?
PARTITION BY divides the result set
into partitions, and ORDER BY
specifies the order of rows within
each partition.
27. HOW DO YOU CALCULATE RUNNING TOTALS USING WINDOW FUNCTIONS?
Use SUM(column) OVER
(PARTITION BY column ORDER BY
column ROWS UNBOUNDED
PRECEDING) for running totals.
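A per-partition running total might look like the following sketch. It runs on SQLite through Python's sqlite3 module as a stand-in engine (SQLite 3.25+ assumed); the orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, customerid TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "A", 20), (3, "B", 5), (4, "B", 7)],
)

running = conn.execute("""
    SELECT orderid, customerid, val,
           SUM(val) OVER (
               PARTITION BY customerid
               ORDER BY orderid
               ROWS UNBOUNDED PRECEDING   -- shorthand for "... AND CURRENT ROW"
           ) AS running_total
    FROM orders
    ORDER BY customerid, orderid
""").fetchall()

print(running)
```

The total restarts at the first row of each customer because PARTITION BY resets the frame.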
28. DESCRIBE THE USE OF COUNT() AS A WINDOW FUNCTION.
29. HOW CAN YOU USE WINDOW FUNCTIONS TO IDENTIFY GAPS AND ISLANDS IN DATA?
30. WHAT ARE HYPOTHETICAL SET FUNCTIONS?
31. EXPLAIN THE STRING_AGG() FUNCTION.
STRING_AGG(column, delimiter)
WITHIN GROUP (ORDER BY
column) concatenates values
into a single string. In SQL
Server it is an aggregate that
does not accept an OVER clause,
so use it with GROUP BY to build
one string per group.
32. HOW DOES GROUP BY DIFFER FROM PARTITION BY IN WINDOW FUNCTIONS?
GROUP BY aggregates rows into
groups and reduces the result
set, while PARTITION BY defines
partitions for window functions
without reducing the result set.
33. WHAT IS THE SIGNIFICANCE OF ROWS UNBOUNDED PRECEDING?
ROWS UNBOUNDED PRECEDING
specifies the frame from the first
row in the partition to the current
row.
34. HOW DO YOU REMOVE DUPLICATES USING WINDOW FUNCTIONS?
Use ROW_NUMBER() OVER
(PARTITION BY the columns that
define a duplicate ORDER BY
column) to number the rows in
each duplicate group, then keep
row number 1 and remove the
rest.
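The de-duplication pattern can be sketched as follows. SQLite via Python's sqlite3 module is used as a stand-in engine (SQLite 3.25+ assumed), and the contacts table, with email as the column that defines a duplicate, is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),  # duplicate of id 1
        (4, "a@example.com"),  # duplicate of id 1
    ],
)

# Number the rows inside each group of equal emails, then delete every row
# except the first (lowest id) of its group.
conn.execute("""
    DELETE FROM contacts
    WHERE id IN (
        SELECT id FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
            FROM contacts
        )
        WHERE rn > 1
    )
""")

remaining = conn.execute("SELECT id, email FROM contacts ORDER BY id").fetchall()
print(remaining)
```

The ORDER BY inside the window decides which copy survives; here it is the row with the lowest id.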
35. CAN WINDOW FUNCTIONS BE NESTED?
36. HOW DO YOU CALCULATE A MOVING SUM USING WINDOW FUNCTIONS?
Use SUM(column) OVER
(PARTITION BY column ORDER BY
column ROWS BETWEEN n
PRECEDING AND CURRENT ROW) to
calculate a moving sum.
37. EXPLAIN THE CONCEPT OF FRAMING IN WINDOW FUNCTIONS.
Framing specifies the subset of
rows within a partition for the
window function, defined by
ROWS or RANGE clauses.
38. HOW DO YOU USE THE OVER CLAUSE WITH PARTITION BY AND ORDER BY?
OVER (PARTITION BY column
ORDER BY column) defines the
window by partitioning the rows
and ordering them within each
partition.
39. WHAT IS THE PURPOSE OF NTILE() IN WINDOW FUNCTIONS?
40. HOW DO WINDOW FUNCTIONS HANDLE TIES IN RANKING?
41. DESCRIBE HOW TO CALCULATE THE MEDIAN USING WINDOW FUNCTIONS.
Use PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY
column) OVER (PARTITION BY
column) to calculate the median.
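On engines without PERCENTILE_CONT(), a median can be emulated with ROW_NUMBER() and COUNT() OVER. The sketch below shows that swapped-in technique, not the PERCENTILE_CONT() call itself, on SQLite through Python's sqlite3 module (SQLite 3.25+ assumed). The orders table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customerid TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("A", 10), ("A", 20), ("A", 30), ("A", 40), ("B", 5), ("B", 7), ("B", 100)],
)

medians = conn.execute("""
    WITH numbered AS (
        SELECT customerid, val,
               ROW_NUMBER() OVER (PARTITION BY customerid ORDER BY val) AS rn,
               COUNT(*)     OVER (PARTITION BY customerid)              AS cnt
        FROM orders
    )
    -- Integer division picks the single middle row for odd counts and the
    -- two middle rows for even counts; AVG then interpolates between them.
    SELECT customerid, AVG(val) AS median
    FROM numbered
    WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
    GROUP BY customerid
    ORDER BY customerid
""").fetchall()

print(medians)
```

Customer A has four values, so the median is the mean of the two middle ones; customer B has three, so the middle value is returned unchanged.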
42. WHAT ARE THE LIMITATIONS OF WINDOW FUNCTIONS IN SQL SERVER?
Limitations include the inability to
use window functions in WHERE,
HAVING, and GROUP BY clauses
(they are allowed only in SELECT
and ORDER BY), and lack of support
for some standard features, such
as IGNORE NULLS before SQL
Server 2022.
43. HOW DO YOU HANDLE NULL VALUES IN WINDOW FUNCTIONS?
Use conditional expressions or
subqueries to handle NULL values;
before SQL Server 2022, LAG, LEAD,
FIRST_VALUE, and LAST_VALUE do
not support IGNORE NULLS.
44. WHAT IS THE WINDOW CLAUSE USED FOR?
45. HOW DO YOU IMPLEMENT PAGING USING WINDOW FUNCTIONS?
Use ROW_NUMBER() OVER (ORDER
BY column) to assign row
numbers and then filter by the
desired page range.
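The paging approach can be sketched as below, with the page number and page size passed as parameters. SQLite via Python's sqlite3 module is a stand-in engine (SQLite 3.25+ assumed); the orders table and the page and page_size variables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 8)],  # orderid 1..7
)

page, page_size = 2, 3
first_row = (page - 1) * page_size + 1   # 4
last_row = page * page_size              # 6

# The window function is computed in a derived table because it cannot be
# referenced directly in WHERE.
page_rows = conn.execute("""
    SELECT orderid, val
    FROM (
        SELECT orderid, val,
               ROW_NUMBER() OVER (ORDER BY orderid) AS rn
        FROM orders
    )
    WHERE rn BETWEEN ? AND ?
    ORDER BY rn
""", (first_row, last_row)).fetchall()

print(page_rows)
```

For stable pages the ORDER BY inside the window should be unique; otherwise tied rows may move between pages from one run to the next.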
46. WHAT ARE STATISTICAL WINDOW FUNCTIONS?
47. EXPLAIN THE CONCEPT OF ROW PATTERN RECOGNITION IN SQL.
Row pattern recognition (the SQL
standard's MATCH_RECOGNIZE
clause) allows the detection of
patterns within sequences of rows,
a feature SQL Server does not
currently support.
48. HOW CAN WINDOW FUNCTIONS BE OPTIMIZED IN SQL SERVER?
Optimization techniques include
an index on the partitioning and
ordering columns (so the window
needs no extra sort), parallelism,
and batch-mode processing.
49. DESCRIBE THE USE OF FIRST_VALUE AND LAST_VALUE IN WINDOW FUNCTIONS.
FIRST_VALUE and LAST_VALUE
return the first and last values in
the window frame, respectively,
based on the specified ordering.
50. HOW DO YOU PERFORM CONDITIONAL AGGREGATION USING WINDOW FUNCTIONS?
Use CASE expressions within
aggregate functions to perform
conditional aggregation, e.g.,
SUM(CASE WHEN condition THEN
column ELSE 0 END) OVER
(PARTITION BY column).
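A concrete instance of the CASE pattern, under assumptions: SQLite via Python's sqlite3 module stands in for the database engine (SQLite 3.25+ assumed), and the orders table with its status column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (orderid INTEGER, customerid TEXT, status TEXT, val INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "A", "shipped", 10),
        (2, "A", "pending", 20),
        (3, "A", "shipped", 30),
        (4, "B", "pending", 5),
    ],
)

# Only shipped orders contribute to the sum, but every row keeps its place
# in the result and sees its customer's shipped total.
shipped = conn.execute("""
    SELECT orderid,
           SUM(CASE WHEN status = 'shipped' THEN val ELSE 0 END)
               OVER (PARTITION BY customerid) AS shipped_total
    FROM orders
    ORDER BY orderid
""").fetchall()

print(shipped)
```

The ELSE 0 branch makes a partition with no matching rows return 0 rather than NULL.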
51. HOW DO YOU CALCULATE A CUMULATIVE SUM USING WINDOW FUNCTIONS?
Example:
SELECT orderid, val,
       SUM(val) OVER (ORDER BY orderid) AS cumulative_sum
FROM Sales.OrderValues;
52. EXPLAIN THE ROW_NUMBER() FUNCTION WITH AN EXAMPLE.
Example:
SELECT orderid, val,
       ROW_NUMBER() OVER (ORDER BY val DESC) AS row_num
FROM Sales.OrderValues;
53. HOW DO YOU USE LAG() TO FIND THE PREVIOUS ROW'S VALUE?
Example:
SELECT orderid, val,
       LAG(val, 1) OVER (ORDER BY orderid) AS prev_val
FROM Sales.OrderValues;
54. PROVIDE AN EXAMPLE OF USING LEAD() TO FIND THE NEXT ROW'S VALUE.
Example:
SELECT orderid, val,
       LEAD(val, 1) OVER (ORDER BY orderid) AS next_val
FROM Sales.OrderValues;
55. HOW DO YOU CALCULATE A MOVING AVERAGE USING WINDOW FUNCTIONS?
Example:
SELECT orderid, val,
       AVG(val) OVER (ORDER BY orderid
                      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM Sales.OrderValues;
56. EXPLAIN HOW TO RANK ROWS USING RANK().
Example:
SELECT orderid, val,
       RANK() OVER (ORDER BY val DESC) AS rnk
FROM Sales.OrderValues;
57. HOW DO YOU USE DENSE_RANK() TO RANK ROWS WITHOUT GAPS?
Example:
SELECT orderid, val,
       DENSE_RANK() OVER (ORDER BY val DESC) AS dense_rnk
FROM Sales.OrderValues;
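How ROW_NUMBER(), RANK(), and DENSE_RANK() treat ties is easiest to see side by side. The sketch below uses SQLite through Python's sqlite3 module as a stand-in engine (SQLite 3.25+ assumed); the orders table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100), (2, 90), (3, 90), (4, 80)],  # orders 2 and 3 tie on val
)

ranked = conn.execute("""
    SELECT orderid, val,
           -- orderid is added as a tiebreaker so the numbering is deterministic
           ROW_NUMBER() OVER (ORDER BY val DESC, orderid) AS row_num,
           RANK()       OVER (ORDER BY val DESC)          AS rnk,
           DENSE_RANK() OVER (ORDER BY val DESC)          AS dense_rnk
    FROM orders
    ORDER BY val DESC, orderid
""").fetchall()

for row in ranked:
    print(row)
```

The tied rows share rank 2 under both RANK() and DENSE_RANK(); the next row is ranked 4 by RANK() (a gap) and 3 by DENSE_RANK() (no gap).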
58. DESCRIBE THE NTILE() FUNCTION WITH AN EXAMPLE.
Example:
SELECT orderid, val,
       NTILE(4) OVER (ORDER BY val) AS quartile
FROM Sales.OrderValues;
59. HOW DO YOU USE FIRST_VALUE() TO GET THE FIRST VALUE IN A WINDOW?
Example:
SELECT orderid, val,
       FIRST_VALUE(val) OVER (ORDER BY orderid) AS first_val
FROM Sales.OrderValues;
60. PROVIDE AN EXAMPLE OF USING LAST_VALUE() TO GET THE LAST VALUE IN A WINDOW.
Example:
SELECT orderid, val,
       LAST_VALUE(val) OVER (ORDER BY orderid
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                      AND UNBOUNDED FOLLOWING) AS last_val
FROM Sales.OrderValues;
61. HOW DO YOU CALCULATE THE DIFFERENCE BETWEEN CURRENT AND PREVIOUS ROW VALUES USING LAG()?
Example:
SELECT orderid, val,
       val - LAG(val, 1) OVER (ORDER BY orderid) AS diff
FROM Sales.OrderValues;
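One detail worth seeing in action is that the first row has no predecessor, so LAG() yields NULL and the difference is NULL too. The sketch below runs the same pattern on SQLite through Python's sqlite3 module as a stand-in engine (SQLite 3.25+ assumed), with a hypothetical orders table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, val INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 15), (3, 12)])

diffs = conn.execute("""
    SELECT orderid, val,
           val - LAG(val, 1) OVER (ORDER BY orderid) AS diff
    FROM orders
    ORDER BY orderid
""").fetchall()

print(diffs)  # NULL arrives in Python as None
```

If a number is preferred over NULL for the first row, LAG() accepts a third argument as the default, for example LAG(val, 1, val), which makes the first difference 0.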
62. EXPLAIN THE USE OF PERCENT_RANK() WITH AN EXAMPLE.
Example:
SELECT orderid, val,
       PERCENT_RANK() OVER (ORDER BY val) AS pct_rank
FROM Sales.OrderValues;
63. HOW DO YOU CALCULATE THE CUMULATIVE DISTRIBUTION USING CUME_DIST()?
Example:
SELECT orderid, val,
       CUME_DIST() OVER (ORDER BY val) AS cumulative_dist
FROM Sales.OrderValues;
64. PROVIDE AN EXAMPLE OF USING PERCENTILE_CONT() TO CALCULATE A PERCENTILE.
Example:
SELECT orderid, val,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY val) OVER () AS median
FROM Sales.OrderValues;
65. EXPLAIN HOW TO USE PERCENTILE_DISC() TO CALCULATE A PERCENTILE.
Example:
SELECT orderid, val,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) OVER () AS median
FROM Sales.OrderValues;
66. HOW DO YOU CALCULATE A RUNNING TOTAL USING WINDOW FUNCTIONS?
Example:
SELECT orderid, val,
       SUM(val) OVER (ORDER BY orderid
                      ROWS BETWEEN UNBOUNDED PRECEDING
                               AND CURRENT ROW) AS running_total
FROM Sales.OrderValues;
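67. DESCRIBE THE USE OF THE RANGE CLAUSE WITH AN EXAMPLE.
Example (standard SQL syntax, as accepted by PostgreSQL; SQL Server's
RANGE supports only UNBOUNDED and CURRENT ROW bounds. An interval
offset needs a date ordering column, assumed here to be orderdate):
SELECT orderid, orderdate, val,
       SUM(val) OVER (ORDER BY orderdate
                      RANGE BETWEEN INTERVAL '1' DAY PRECEDING
                                AND CURRENT ROW) AS range_sum
FROM Sales.OrderValues;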
68. HOW DO YOU IMPLEMENT A SLIDING WINDOW CALCULATION?
Example:
SELECT orderid, val,
       AVG(val) OVER (ORDER BY orderid
                      ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS sliding_avg
FROM Sales.OrderValues;
69. PROVIDE AN EXAMPLE OF USING LAG() AND LEAD() TOGETHER.
Example:
SELECT orderid, val,
       LAG(val, 1)  OVER (ORDER BY orderid) AS prev_val,
       LEAD(val, 1) OVER (ORDER BY orderid) AS next_val
FROM Sales.OrderValues;
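70. EXPLAIN HOW TO PARTITION DATA USING WINDOW FUNCTIONS.
Example:
-- Without ORDER BY the frame is the whole partition, so every row
-- shows its customer's total (adding ORDER BY orderid would turn
-- this into a running total per customer).
SELECT orderid, customerid, val,
       SUM(val) OVER (PARTITION BY customerid) AS customer_total
FROM Sales.OrderValues;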
71. HOW DO YOU USE WINDOW FUNCTIONS TO CALCULATE THE PERCENTAGE OF A TOTAL?
Example:
SELECT orderid, val,
       val * 100.0 / SUM(val) OVER () AS percent_of_total
FROM Sales.OrderValues;
72. PROVIDE AN EXAMPLE OF USING NTILE() TO CREATE DECILES.
Example:
SELECT orderid, val,
       NTILE(10) OVER (ORDER BY val) AS decile
FROM Sales.OrderValues;
73. HOW DO YOU USE WINDOW FUNCTIONS TO FIND THE TOP N ROWS PER GROUP?
Example:
-- The alias is rn rather than rank, which is a function name and a
-- reserved word in some engines.
WITH RankedOrders AS (
    SELECT orderid, customerid, val,
           ROW_NUMBER() OVER (PARTITION BY customerid
                              ORDER BY val DESC) AS rn
    FROM Sales.OrderValues
)
SELECT * FROM RankedOrders
WHERE rn <= 3;
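The same top-N-per-group pattern can be run end to end on a toy data set. The sketch below uses SQLite through Python's sqlite3 module as a stand-in engine (SQLite 3.25+ assumed) and keeps the top 2 rows per customer; the orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, customerid TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "A", 50), (3, "A", 30), (4, "A", 20), (5, "B", 5), (6, "B", 7)],
)

top_two = conn.execute("""
    WITH ranked AS (
        SELECT orderid, customerid, val,
               ROW_NUMBER() OVER (PARTITION BY customerid
                                  ORDER BY val DESC) AS rn
        FROM orders
    )
    SELECT customerid, val, rn
    FROM ranked
    WHERE rn <= 2
    ORDER BY customerid, rn
""").fetchall()

print(top_two)
```

ROW_NUMBER() returns exactly N rows per group even when values tie; switching to RANK() or DENSE_RANK() would keep all tied rows instead.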
74. EXPLAIN HOW TO USE WINDOW FUNCTIONS FOR DATA DE-DUPLICATION.
Example:
WITH CTE AS (
    SELECT orderid, val,
           ROW_NUMBER() OVER (PARTITION BY val
                              ORDER BY orderid) AS rn
    FROM Sales.OrderValues
)
DELETE FROM CTE WHERE rn > 1;
75. HOW DO YOU PERFORM CUMULATIVE AGGREGATION USING WINDOW FUNCTIONS?
Example:
SELECT orderid, val,
       SUM(val) OVER (ORDER BY orderid
                      ROWS BETWEEN UNBOUNDED PRECEDING
                               AND CURRENT ROW) AS cumulative_sum
FROM Sales.OrderValues;
This query calculates, for each
row, the cumulative sum of val
over all preceding rows and the
current row, in orderid order.