Open In App

PostgreSQL – CUME_DIST Function

Last Updated : 18 Jul, 2024
Comments
Improve
Suggest changes
Like Article
Like
Report

The PostgreSQL CUME_DIST() function is a powerful analytical tool used to determine the relative position of a value within a set of given values. This function helps compute the cumulative distribution of values in a result set, which can be particularly useful in statistical analysis and reporting.

Let us better understand the CUME_DIST Function in PostgreSQL from this article.

Syntax

CUME_DIST() OVER (
[PARTITION BY partition_expression, ... ]
ORDER BY sorting_expression [ASC | DESC], ...
)

Parameters:

  • PARTITION BY: This optional clause divides the result set into partitions where the function is applied. If not specified, PostgreSQL treats the entire result set as a single partition.
  • ORDER BY: This clause sorts the rows in each partition where the function is applied.

Return Value:

The CUME_DIST() function returns a double-precision value between 0 and 1:

0 < CUME_DIST() <= 1

PostgreSQL CUME_DIST Function Examples

Let us take a look at some of the examples of CUME_DIST Function in PostgreSQL to better understand the concept.

Example 1: Sales Percentile for 2018

First, create a table named ‘sales_stats’ that stores theĀ  sales revenue by employees:

PostgreSQL
CREATE TABLE sales_stats(
    name VARCHAR(100) NOT NULL,
    year SMALLINT NOT NULL CHECK (year > 0),
    amount DECIMAL(10, 2) CHECK (amount >= 0),
    PRIMARY KEY (name, year)
);
INSERT INTO 
    sales_stats(name, year, amount)
VALUES
    ('Raju kumar', 2018, 120000),
    ('Alibaba', 2018, 110000),
    ('Gabbar Singh', 2018, 150000),
    ('Kadar Khan', 2018, 30000),
    ('Amrish Puri', 2018, 200000),
    ('Raju kumar', 2019, 150000),
    ('Alibaba', 2019, 130000),
    ('Gabbar Singh', 2019, 180000),
    ('Kadar Khan', 2019, 25000),
    ('Amrish Puri', 2019, 270000);

The following query returns the sales amount percentile for each sales employee in 2018.

Query:

SELECT 
name,
year,
amount,
CUME_DIST() OVER (
ORDER BY amount
)
FROM
sales_stats
WHERE
year = 2018;

Output:

PostgreSQL CUME_DIST Function Example

Example 2: Sales Percentile for 2018 and 2019

The following query uses the CUME_DIST() function to calculate the sales percentile for each sales employee in 2018 and 2019.

Query:

SELECT 
name,
year,
amount,
CUME_DIST() OVER (
PARTITION BY year
ORDER BY amount
)
FROM
sales_stats;

Output:

PostgreSQL CUME_DIST Function Example

Important Points About PostgreSQL CUME_DIST Function

  • The CUME_DIST() function calculates the cumulative distribution of a value in a dataset.
  • When there are ties (duplicate values) in the ordering column, CUME_DIST() assigns the same cumulative distribution value to each tied row.
  • The result of CUME_DIST() is in double-precision. Be mindful of this when performing further calculations or comparisons with the result.
  • CUME_DIST() and PERCENT_RANK() both return values between 0 and 1, but they differ in calculation. CUME_DIST() shows the proportion of rows with values less than or equal to the current row, while PERCENT_RANK() indicates the relative rank of the current row within the partition.


Next Article

Similar Reads