Summchpt 3

Uploaded by

Rahul Bhole
Aggregate window functions
POSTGRESQL SUMMARY STATS AND WINDOW FUNCTIONS

Michel Semaan
Data Scientist
Source table
Query

SELECT
  Year, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
  Country = 'BRA'
  AND Medal = 'Gold'
  AND Year >= 1992
GROUP BY Year
ORDER BY Year ASC;

Result

| Year | Medals |
|------|--------|
| 1992 | 13     |
| 1996 | 5      |
| 2004 | 18     |
| 2008 | 14     |
| 2012 | 14     |

Aggregate functions
MAX Query

WITH Brazil_Medals AS (...)
SELECT MAX(Medals) AS Max_Medals
FROM Brazil_Medals;

MAX Result

18

SUM Query

WITH Brazil_Medals AS (...)
SELECT SUM(Medals) AS Total_Medals
FROM Brazil_Medals;

SUM Result

64
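
The two aggregate queries above can be tried without access to Summer_Medals. The following is a minimal sketch, assuming the elided CTE body `(...)` can be replaced by the five rows shown in the Source table result; the inline VALUES list is a stand-in, not the course's actual CTE.

```sql
-- Stand-in for the elided CTE body: the five rows from the Source table
-- result, inlined so the example runs without Summer_Medals (an assumption).
WITH Brazil_Medals (Year, Medals) AS (
  VALUES (1992, 13), (1996, 5), (2004, 18), (2008, 14), (2012, 14)
)
SELECT
  MAX(Medals) AS Max_Medals,    -- 18
  SUM(Medals) AS Total_Medals   -- 64
FROM Brazil_Medals;
```

Both aggregates collapse the five input rows into a single output row, which is the behaviour the window versions avoid.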
MAX Window function
Query

WITH Brazil_Medals AS (...)
SELECT
  Year, Medals,
  MAX(Medals)
    OVER (ORDER BY Year ASC) AS Max_Medals
FROM Brazil_Medals;

Result

| Year | Medals | Max_Medals |
|------|--------|------------|
| 1992 | 13     | 13         |
| 1996 | 5      | 13         |
| 2004 | 18     | 18         |
| 2008 | 14     | 18         |
| 2012 | 14     | 18         |

SUM Window function
Query

WITH Brazil_Medals AS (...)
SELECT
  Year, Medals,
  SUM(Medals) OVER (ORDER BY Year ASC) AS Medals_RT
FROM Brazil_Medals;

Result

| Year | Medals | Medals_RT |
|------|--------|-----------|
| 1992 | 13     | 13        |
| 1996 | 5      | 18        |
| 2004 | 18     | 36        |
| 2008 | 14     | 50        |
| 2012 | 14     | 64        |
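
As a runnable sketch of the running maximum and running total, the query below computes both window columns side by side. It assumes the same inline stand-in rows for the elided Brazil_Medals CTE, taken from the Source table result.

```sql
-- Inline rows stand in for the elided CTE body (an assumption).
WITH Brazil_Medals (Year, Medals) AS (
  VALUES (1992, 13), (1996, 5), (2004, 18), (2008, 14), (2012, 14)
)
SELECT
  Year, Medals,
  MAX(Medals) OVER (ORDER BY Year ASC) AS Max_Medals,  -- highest count so far
  SUM(Medals) OVER (ORDER BY Year ASC) AS Medals_RT    -- running total so far
FROM Brazil_Medals
ORDER BY Year ASC;
-- Last row (2012): Max_Medals = 18, Medals_RT = 64
```

Unlike the plain aggregates, every input row is kept, and the final row carries the same values the aggregate queries return.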

Partitioning with aggregate window functions
Query (no partitioning)

WITH Medals AS (...)
SELECT Year, Country, Medals,
  SUM(Medals) OVER (...)
FROM Medals;

Result

| Year | Country | Medals | Medals_RT |
|------|---------|--------|-----------|
| 2004 | BRA     | 18     | 18        |
| 2008 | BRA     | 14     | 32        |
| 2012 | BRA     | 14     | 46        |
| 2004 | CUB     | 31     | 77        |
| 2008 | CUB     | 2      | 79        |
| 2012 | CUB     | 5      | 84        |

Query (partitioned by Country)

WITH Medals AS (...)
SELECT Year, Country, Medals,
  SUM(Medals) OVER (PARTITION BY Country ...)
FROM Medals;

Result

| Year | Country | Medals | Medals_RT |
|------|---------|--------|-----------|
| 2004 | BRA     | 18     | 18        |
| 2008 | BRA     | 14     | 32        |
| 2012 | BRA     | 14     | 46        |
| 2004 | CUB     | 31     | 31        |
| 2008 | CUB     | 2      | 33        |
| 2012 | CUB     | 5      | 38        |
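
The effect of PARTITION BY can be reproduced with the six rows from the results above. This is a sketch under stated assumptions: the slide elides the contents of both OVER clauses, so the ORDER BY lists used here (Country then Year without partitioning, Year alone within each partition) are guesses chosen because they reproduce the totals shown. The alias Medals_RT_All is hypothetical.

```sql
-- Inline rows copied from the result tables; they stand in for the elided CTE.
WITH Medals (Year, Country, Medals) AS (
  VALUES
    (2004, 'BRA', 18), (2008, 'BRA', 14), (2012, 'BRA', 14),
    (2004, 'CUB', 31), (2008, 'CUB', 2),  (2012, 'CUB', 5)
)
SELECT
  Year, Country, Medals,
  -- No partition: the total keeps accumulating across countries.
  SUM(Medals) OVER (ORDER BY Country ASC, Year ASC) AS Medals_RT_All,
  -- Partitioned: the total restarts for each country.
  SUM(Medals) OVER (PARTITION BY Country ORDER BY Year ASC) AS Medals_RT
FROM Medals
ORDER BY Country ASC, Year ASC;
-- CUB 2004 row: Medals_RT_All = 77, Medals_RT = 31
```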
Let's practice!
Frames
Motivation
LAST_VALUE

LAST_VALUE(City) OVER (
ORDER BY Year ASC
RANGE BETWEEN
UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING
) AS Last_City

Frame: RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING


Without the frame, LAST_VALUE would return the current row's own value in the City column
By default, when the window has an ORDER BY, a frame starts at the beginning of the table or partition and ends at the current row
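
To see the difference the frame makes, the sketch below runs LAST_VALUE with and without it. The Hosts CTE and the Last_City_Default alias are hypothetical names introduced for illustration; the three rows are the host cities of the 2004, 2008, and 2012 Summer Olympics.

```sql
-- Hypothetical three-row table of host cities, used only for illustration.
WITH Hosts (Year, City) AS (
  VALUES (2004, 'Athens'), (2008, 'Beijing'), (2012, 'London')
)
SELECT
  Year, City,
  -- Default frame ends at the current row, so this echoes the row's own City.
  LAST_VALUE(City) OVER (ORDER BY Year ASC) AS Last_City_Default,
  -- Explicit frame spans the whole table, so this is the final City.
  LAST_VALUE(City) OVER (
    ORDER BY Year ASC
    RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS Last_City
FROM Hosts
ORDER BY Year ASC;
-- Last_City_Default: Athens, Beijing, London; Last_City: London on every row
```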

ROWS BETWEEN
ROWS BETWEEN [START] AND [FINISH]
n PRECEDING : n rows before the current row

CURRENT ROW : the current row

n FOLLOWING : n rows after the current row

Examples

ROWS BETWEEN 3 PRECEDING AND CURRENT ROW

ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING

ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING
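
One way to check which rows each of these example frames covers is to collect the frame's contents into an array. The sketch below does that over the integers 1 to 6; the series and the column aliases are made up for illustration and are not part of the course data.

```sql
-- array_agg used as a window function lists the rows inside each frame.
SELECT
  n,
  array_agg(n) OVER (ORDER BY n ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS prev3_and_current,
  array_agg(n) OVER (ORDER BY n ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS neighbours,
  array_agg(n) OVER (ORDER BY n ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING) AS up_to_5_before
FROM generate_series(1, 6) AS t(n)
ORDER BY n;
-- For n = 4: {1,2,3,4}, {3,4,5}, {1,2,3}
-- For n = 1: {1}, {1,2}, NULL (the third frame is empty on the first row)
```

Frames are simply truncated at the edges of the table or partition, which is why the first rows see fewer values.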


Source table
Query

SELECT
  Year, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
  Country = 'RUS'
  AND Medal = 'Gold'
GROUP BY Year
ORDER BY Year ASC;

Result

| Year | Medals |
|------|--------|
| 1996 | 36     |
| 2000 | 66     |
| 2004 | 47     |
| 2008 | 43     |
| 2012 | 47     |

MAX without a frame
Query

WITH Russia_Medals AS (...)
SELECT
  Year, Medals,
  MAX(Medals)
    OVER (ORDER BY Year ASC) AS Max_Medals
FROM Russia_Medals
ORDER BY Year ASC;

Result

| Year | Medals | Max_Medals |
|------|--------|------------|
| 1996 | 36     | 36         |
| 2000 | 66     | 66         |
| 2004 | 47     | 66         |
| 2008 | 43     | 66         |
| 2012 | 47     | 66         |

MAX with a frame
Query

WITH Russia_Medals AS (...)
SELECT
  Year, Medals,
  MAX(Medals)
    OVER (ORDER BY Year ASC) AS Max_Medals,
  MAX(Medals)
    OVER (ORDER BY Year ASC
          ROWS BETWEEN
          1 PRECEDING AND CURRENT ROW)
    AS Max_Medals_Last
FROM Russia_Medals
ORDER BY Year ASC;

Result

| Year | Medals | Max_Medals | Max_Medals_Last |
|------|--------|------------|-----------------|
| 1996 | 36     | 36         | 36              |
| 2000 | 66     | 66         | 66              |
| 2004 | 47     | 66         | 66              |
| 2008 | 43     | 66         | 47              |
| 2012 | 47     | 66         | 47              |
Current and following rows
Query

WITH Russia_Medals AS (...)
SELECT
  Year, Medals,
  MAX(Medals)
    OVER (ORDER BY Year ASC
          ROWS BETWEEN
          CURRENT ROW AND 1 FOLLOWING)
    AS Max_Medals_Next
FROM Russia_Medals
ORDER BY Year ASC;

Result

| Year | Medals | Max_Medals_Next |
|------|--------|-----------------|
| 1996 | 36     | 66              |
| 2000 | 66     | 66              |
| 2004 | 47     | 47              |
| 2008 | 43     | 47              |
| 2012 | 47     | 47              |
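
The three MAX variants from the Russia slides can be compared in a single runnable query. This sketch assumes the elided CTE body can be replaced by the five rows of the Russia Source table result, inlined as a VALUES list.

```sql
-- Inline rows stand in for the elided Russia_Medals CTE body (an assumption).
WITH Russia_Medals (Year, Medals) AS (
  VALUES (1996, 36), (2000, 66), (2004, 47), (2008, 43), (2012, 47)
)
SELECT
  Year, Medals,
  -- No explicit frame: maximum over all rows up to the current one.
  MAX(Medals) OVER (ORDER BY Year ASC) AS Max_Medals,
  -- Previous row and current row only.
  MAX(Medals) OVER (ORDER BY Year ASC
                    ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS Max_Medals_Last,
  -- Current row and next row only.
  MAX(Medals) OVER (ORDER BY Year ASC
                    ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS Max_Medals_Next
FROM Russia_Medals
ORDER BY Year ASC;
-- 2008 row: Max_Medals = 66, Max_Medals_Last = 47, Max_Medals_Next = 47
```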

Let's practice!
Moving averages and totals
Moving averages
Overview
Moving average (MA): Average of the last n periods
Example: the 10-day MA of units sold is the average of units sold over the last 10 days
Used to indicate momentum/trends
Also useful in eliminating seasonality
Moving total: Sum of the last n periods
Example: Sum of the last 3 Olympic games' medals
Used to indicate performance; if the sum is going down, overall performance is going down

Source table
Query

SELECT
  Year, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
  Country = 'USA'
  AND Medal = 'Gold'
  AND Year >= 1980
GROUP BY Year
ORDER BY Year ASC;

Result

| Year | Medals |
|------|--------|
| 1984 | 168    |
| 1988 | 77     |
| 1992 | 89     |
| 1996 | 160    |
| 2000 | 130    |
| 2004 | 116    |
| 2008 | 125    |
| 2012 | 147    |

Moving average
Query

WITH US_Medals AS (...)
SELECT
  Year, Medals,
  AVG(Medals) OVER
    (ORDER BY Year ASC
     ROWS BETWEEN
     2 PRECEDING AND CURRENT ROW) AS Medals_MA
FROM US_Medals
ORDER BY Year ASC;

Result

| Year | Medals | Medals_MA |
|------|--------|-----------|
| 1984 | 168    | 168.00    |
| 1988 | 77     | 122.50    |
| 1992 | 89     | 111.33    |
| 1996 | 160    | 108.67    |
| 2000 | 130    | 126.33    |
| 2004 | 116    | 135.33    |
| 2008 | 125    | 123.67    |
| 2012 | 147    | 129.33    |

Moving total
Query

WITH US_Medals AS (...)
SELECT
  Year, Medals,
  SUM(Medals) OVER
    (ORDER BY Year ASC
     ROWS BETWEEN
     2 PRECEDING AND CURRENT ROW) AS Medals_MT
FROM US_Medals
ORDER BY Year ASC;

Result

| Year | Medals | Medals_MT |
|------|--------|-----------|
| 1984 | 168    | 168       |
| 1988 | 77     | 245       |
| 1992 | 89     | 334       |
| 1996 | 160    | 326       |
| 2000 | 130    | 379       |
| 2004 | 116    | 406       |
| 2008 | 125    | 371       |
| 2012 | 147    | 388       |
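
The 3-game moving average and moving total can be computed together in one runnable query. The sketch assumes the elided US_Medals CTE can be replaced by the eight rows of the USA Source table result. The ROUND call is an addition, not part of the slide's query: PostgreSQL's AVG over integers returns a numeric with many decimal places, and rounding to two matches the figures shown in the result table.

```sql
-- Inline rows stand in for the elided US_Medals CTE body (an assumption).
WITH US_Medals (Year, Medals) AS (
  VALUES (1984, 168), (1988, 77), (1992, 89), (1996, 160),
         (2000, 130), (2004, 116), (2008, 125), (2012, 147)
)
SELECT
  Year, Medals,
  -- Average of the current and two preceding games, rounded for display.
  ROUND(AVG(Medals) OVER (ORDER BY Year ASC
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 2) AS Medals_MA,
  -- Sum over the same three-row frame.
  SUM(Medals) OVER (ORDER BY Year ASC
                    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS Medals_MT
FROM US_Medals
ORDER BY Year ASC;
-- 1996 row: Medals_MA = 108.67 (326 / 3), Medals_MT = 326 (77 + 89 + 160)
```

The first two rows have fewer than three games available, so their averages divide by 1 and 2 rather than 3.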

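ROWS vs RANGE
RANGE BETWEEN [START] AND [FINISH]
Functions much the same as ROWS BETWEEN
RANGE treats rows with duplicate values in OVER's ORDER BY subclause as a single entity

Table

| Year | Medals | Rows_RT | Range_RT |
|------|--------|---------|----------|
| 1992 | 10     | 10      | 10       |
| 1996 | 50     | 60      | 110      |
| 2000 | 50     | 110     | 110      |
| 2004 | 60     | 170     | 230      |
| 2008 | 60     | 230     | 230      |
| 2012 | 70     | 300     | 300      |

The duplicates in this table are in Medals (50 and 60 each appear twice), so the running totals appear to be ordered by Medals: Range_RT gives both rows of a tied pair the same total, while Rows_RT advances one row at a time
ROWS BETWEEN is almost always used over RANGE BETWEEN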
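
The table above can be reproduced with the sketch below. The slide does not show the query, so the window definitions are assumptions: both running totals are taken to be ordered by Medals, and Year is added as a tiebreaker in the ROWS version so that the order of tied rows is deterministic. The CTE name Medals_Table is hypothetical.

```sql
-- Rows copied from the table above; the CTE name is hypothetical.
WITH Medals_Table (Year, Medals) AS (
  VALUES (1992, 10), (1996, 50), (2000, 50),
         (2004, 60), (2008, 60), (2012, 70)
)
SELECT
  Year, Medals,
  -- ROWS counts physical rows, so tied rows get different running totals.
  SUM(Medals) OVER (ORDER BY Medals ASC, Year ASC
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Rows_RT,
  -- RANGE includes every row tied with the current one on Medals.
  SUM(Medals) OVER (ORDER BY Medals ASC
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Range_RT
FROM Medals_Table
ORDER BY Year ASC;
-- 1996 row: Rows_RT = 60, Range_RT = 110
```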

Let's practice!