Summchpt 1
Michel Semaan
Data Scientist
Motivation
USA total and running total of Summer Olympics gold medals since 2004

| Year | Medals | Medals_RT |
|------|--------|-----------|
| 2004 | 116 | 116 |
| 2008 | 125 | 241 |
| 2012 | 147 | 388 |

Discus throw reigning champion status

| Year | Champion | Last_Champion | Reigning_Champion |
|------|----------|---------------|-------------------|
| 1996 | GER | null | false |
| 2000 | LTU | GER | false |
| 2004 | LTU | LTU | true |
| 2008 | EST | LTU | false |
| 2012 | GER | EST | false |
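A running total like Medals_RT can be sketched with an aggregate window function. This is a minimal sketch, not necessarily the query behind the slide: `usa_gold` is a hypothetical inline table built from the USA medal counts listed above, so the query runs without the course dataset.

```sql
-- usa_gold is a hypothetical inline table; its values are the USA medal counts shown above.
WITH usa_gold (year, medals) AS (
  VALUES (2004, 116), (2008, 125), (2012, 147)
)
SELECT
  year,
  medals,
  -- With ORDER BY and no explicit frame, SUM adds up every row
  -- from the start of the window through the current row.
  SUM(medals) OVER (ORDER BY year ASC) AS medals_rt
FROM usa_gold
ORDER BY year ASC;
```

Under these assumptions, medals_rt comes out as 116, 241, 388, matching the Medals_RT values above.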
POSTGRESQL SUMMARY STATS AND WINDOW FUNCTIONS
Course outline
1. Introduction to window functions
2. Fetching, ranking, and paging
3. Aggregate window functions and frames
4. Beyond window functions
Columns
Year, City
Medal
Uses
Fetching values from preceding or following rows (e.g. fetching the previous row's value)
Determining reigning champion status
Calculating growth over time
Assigning ordinal ranks (1st, 2nd, etc.) to rows based on their values' positions in a sorted list
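Two of these uses, growth over time and ordinal ranks, can be sketched in one query. This is an illustration under assumptions: `usa_gold` is a hypothetical inline table holding the USA medal counts from the Motivation slide, and `RANK()` is a built-in PostgreSQL ranking function that is not named in this section.

```sql
-- usa_gold is a hypothetical inline table; values come from the Motivation slide.
WITH usa_gold (year, medals) AS (
  VALUES (2004, 116), (2008, 125), (2012, 147)
)
SELECT
  year,
  medals,
  -- Growth: current value minus the previous row's value (null on the first row).
  medals - LAG(medals) OVER (ORDER BY year ASC) AS growth,
  -- Ordinal rank: 1 for the largest medal count, 2 for the next, and so on.
  RANK() OVER (ORDER BY medals DESC) AS medals_rank
FROM usa_gold
ORDER BY year ASC;
```

The first row has no preceding row, so its growth is null rather than zero.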
SELECT
  Year, Event, Country,
  ROW_NUMBER() OVER () AS Row_N
FROM Summer_Medals
WHERE
  Medal = 'Gold';
PARTITION BY
ROWS/RANGE PRECEDING/FOLLOWING/UNBOUNDED
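To show where a frame clause sits inside OVER, here is a hedged sketch of a two-row moving total. The table `usa_gold` and the alias `medals_last_two` are hypothetical names chosen for illustration; the medal counts are the USA figures from the Motivation slide.

```sql
-- usa_gold is a hypothetical inline table; medals_last_two is an illustrative alias.
WITH usa_gold (year, medals) AS (
  VALUES (2004, 116), (2008, 125), (2012, 147)
)
SELECT
  year,
  medals,
  -- Frame: the previous row (if any) plus the current row.
  SUM(medals) OVER (
    ORDER BY year ASC
    ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
  ) AS medals_last_two
FROM usa_gold
ORDER BY year ASC;
```

The frame goes after ORDER BY inside the parentheses, and it restricts which rows the function sees for each output row.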
Row numbers
Query Result
LAG(column, n) OVER (...) returns column's value from the row n rows before the current row, or null if no such row exists
SELECT
  Year, Champion,
  LAG(Champion, 1) OVER
    (ORDER BY Year ASC) AS Last_Champion
FROM Discus_Gold
ORDER BY Year ASC;
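The reigning champion status from the Motivation slide can be derived by comparing each champion with the LAG result. The sketch below is one possible formulation, not necessarily the course's own query: `discus_gold` is rebuilt inline from the champions listed on that slide, and `with_last` is a hypothetical CTE name.

```sql
-- discus_gold is rebuilt inline from the Motivation slide; with_last is a hypothetical name.
WITH discus_gold (year, champion) AS (
  VALUES (1996, 'GER'), (2000, 'LTU'), (2004, 'LTU'), (2008, 'EST'), (2012, 'GER')
),
with_last AS (
  SELECT
    year,
    champion,
    LAG(champion, 1) OVER (ORDER BY year ASC) AS last_champion
  FROM discus_gold
)
SELECT
  year,
  champion,
  last_champion,
  -- Comparing with a null last_champion yields null, so COALESCE maps it to false.
  COALESCE(champion = last_champion, false) AS reigning_champion
FROM with_last
ORDER BY year ASC;
```

Only the 2004 row is flagged true, because LTU also won in 2000. The 1996 row has no predecessor, so its last_champion is null and the flag falls back to false.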
Motivation
Query Result
SELECT
  Year, Event, Champion,
  LAG(Champion) OVER
    (ORDER BY Event ASC, Year ASC) AS Last_Champion
FROM Discus_Gold
ORDER BY Event ASC, Year ASC;
When Event changes from Discus Throw to Triple Jump, LAG fetched Discus Throw's last champion as opposed to a null
Enter PARTITION BY
PARTITION BY splits the table into partitions based on a column's unique values
The results aren't rolled up into one row per partition: every input row still appears in the output
Each partition is operated on separately by the window function
ROW_NUMBER will reset for each partition
LAG will only fetch a row's previous value if its previous row is in the same partition
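The effect of PARTITION BY on both functions can be sketched as follows. The inline table `gold` is hypothetical. Its Discus Throw champions come from the Motivation slide, while the Triple Jump champions ('AAA', 'BBB', 'CCC') are made-up placeholders, not real results.

```sql
-- gold is a hypothetical inline table.
-- The Triple Jump champions are placeholders, not real Olympic results.
WITH gold (year, event, champion) AS (
  VALUES
    (2004, 'Discus Throw', 'LTU'),
    (2008, 'Discus Throw', 'EST'),
    (2012, 'Discus Throw', 'GER'),
    (2004, 'Triple Jump',  'AAA'),
    (2008, 'Triple Jump',  'BBB'),
    (2012, 'Triple Jump',  'CCC')
)
SELECT
  year,
  event,
  champion,
  -- Numbering restarts at 1 for each event.
  ROW_NUMBER() OVER (PARTITION BY event ORDER BY year ASC) AS row_n,
  -- LAG never reaches across events: the first row of each event gets null.
  LAG(champion) OVER (PARTITION BY event ORDER BY year ASC) AS last_champion
FROM gold
ORDER BY event ASC, year ASC;
```

With the partition in place, the 2004 Triple Jump row gets row_n = 1 and a null last_champion instead of inheriting Discus Throw's 2012 champion.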