Grouping and Aggregate 1674115851

The document provides an overview of SQL grouping and aggregate functions, detailing how to group data, apply aggregate functions like COUNT, SUM, AVG, MAX, and MIN, and filter results using HAVING clauses. It includes examples of single-column and multi-column grouping, as well as handling NULL values and counting distinct entries. Additionally, it discusses the importance of using GROUP BY clauses to specify how data should be aggregated and filtered effectively.

Uploaded by

Suhani Rana
Copyright: © All Rights Reserved

Grouping and Aggregates

/*Grouping and Aggregates

Sometimes you will want to find trends in your data that will require
the database server to cook the data a bit before you can generate the
results you are looking for.*/

SELECT
customer_id
FROM
rental;

SELECT
customer_id
FROM
rental
GROUP BY customer_id;

/*To see how many films each customer rented, you can use an aggregate
function in the select clause to count the number of rows in each
group*/

SELECT
customer_id, COUNT(*) AS number_of_films
FROM
rental
GROUP BY customer_id;

-- https://www.linkedin.com/in/pradeepchandra-reddy-s-c/
/*In order to determine which customers have rented the most films,
simply add an order by clause*/

SELECT
customer_id, COUNT(*) AS number_of_films
FROM
rental
GROUP BY customer_id
ORDER BY 2 DESC;
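
/*A minimal sketch of the same ranking, assuming MySQL syntax: the
order by clause refers to the column alias instead of its position,
and the limit clause (an addition not used elsewhere in this script)
keeps only the five most active customers.*/

SELECT
customer_id, COUNT(*) AS number_of_films
FROM
rental
GROUP BY customer_id
ORDER BY number_of_films DESC
LIMIT 5;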

/*When grouping data, you may need to filter out undesired data from
your result set based on groups of data rather than based on the raw
data. Since the group by clause runs after the where clause has been
evaluated, you cannot add filter conditions to your where clause for
this purpose. Instead, you must put your group filter conditions in
the having clause.*/

SELECT
customer_id, COUNT(*)
FROM
rental
GROUP BY customer_id
HAVING COUNT(*) >= 35;

SELECT
customer_id, COUNT(*)
FROM
rental
GROUP BY customer_id
HAVING COUNT(*) >= 35
ORDER BY 2 DESC;
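
/*One way to picture the having clause, sketched here under the
assumption that derived tables are available: it behaves like a where
clause applied to the grouped result. The alias rental_counts and the
column name num_rentals are illustrative names only.*/

SELECT
customer_id, num_rentals
FROM
(SELECT
customer_id, COUNT(*) AS num_rentals
FROM
rental
GROUP BY customer_id) AS rental_counts
WHERE
num_rentals >= 35
ORDER BY 2 DESC;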

/*Aggregate functions perform a specific operation over all rows in a
group.
max():   Returns the maximum value within a set
min():   Returns the minimum value within a set
avg():   Returns the average value across a set
sum():   Returns the sum of the values across a set
count(): Returns the number of values in a set*/

SELECT
MAX(amount) AS max_amount
FROM
payment;

SELECT
MIN(amount) AS min_amount
FROM
payment;

SELECT
AVG(amount) AS avg_amount,
SUM(amount) AS total_amount,
COUNT(*) num_of_payments
FROM
payment;

SELECT
customer_id,
MAX(amount) max_amt,
MIN(amount) min_amt,
AVG(amount) avg_amt,
SUM(amount) tot_amt,
COUNT(*) num_payments
FROM
payment;

/*While it may be obvious to you that you want the aggregate functions
applied to each customer found in the payment table, this query fails
because you have not explicitly specified how the data should be
grouped. Therefore, you will need to add a group by clause to specify
over which group of rows the aggregate functions should be applied*/

SELECT
customer_id,
MAX(amount) max_amt,
MIN(amount) min_amt,
AVG(amount) avg_amt,
SUM(amount) tot_amt,
COUNT(*) num_payments
FROM
payment
GROUP BY customer_id;

/*With the inclusion of the group by clause, the server knows to group
together rows having the same value in the customer_id column first
and then to apply the five aggregate functions to each of the 599
groups. */

/*Counting Distinct Values*/

SELECT
COUNT(customer_id) num_rows,
COUNT(DISTINCT customer_id) num_customers
FROM
payment;

/*Using Expressions
Along with using columns as arguments to aggregate functions, you can
use expressions as well.*/

SELECT
MAX(DATEDIFF(return_date, rental_date))
FROM
rental;
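
/*An expression can also be aggregated per group. As a sketch, the
query below reports the longest rental, in days, for each customer;
the alias longest_rental_days is an illustrative name. DATEDIFF is
used in its two-argument MySQL form, and any row whose return_date is
null yields a null difference that max() ignores.*/

SELECT
customer_id,
MAX(DATEDIFF(return_date, rental_date)) AS longest_rental_days
FROM
rental
GROUP BY customer_id;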

/*How Nulls Are Handled while grouping*/

CREATE TABLE number_tbl (
val SMALLINT
);

INSERT INTO number_tbl VALUES (1);
INSERT INTO number_tbl VALUES (3);
INSERT INTO number_tbl VALUES (5);

SELECT
COUNT(*) num_rows,
COUNT(val) num_vals,
SUM(val) total,
MAX(val) max_val,
AVG(val) avg_val
FROM
number_tbl;

/*The results are as you would expect: both count(*) and count(val)
return the value 3, sum(val) returns the value 9, max(val) returns
5, and avg(val) returns 3.*/
INSERT INTO number_tbl VALUES (NULL);
SELECT
COUNT(*) num_rows,
COUNT(val) num_vals,
SUM(val) total,
MAX(val) max_val,
AVG(val) avg_val
FROM
number_tbl;

/*Even with the addition of the null value to the table, the sum(),
max(), and avg() functions all return the same values, indicating that
they ignore any null values encountered. The count(*) function now
returns the value 4, which is valid since the number_tbl table contains
four rows, while the count(val) function still returns the value 3.
The difference is that count(*) counts the number of rows, whereas
count(val) counts the number of values contained in the val column
and ignores any null values encountered.*/
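
/*If nulls should count as zero rather than be skipped, one option is
to replace them before aggregating. The sketch below assumes the
standard COALESCE function and the four rows inserted above (1, 3, 5,
and null); the aliases are illustrative names. avg(val) divides the
total of 9 by the three non-null values, while the second column
divides the same total by all four rows.*/

SELECT
AVG(val) AS avg_ignoring_nulls,
AVG(COALESCE(val, 0)) AS avg_nulls_as_zero
FROM
number_tbl;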

/*Single-Column Grouping*/

SELECT
actor_id, COUNT(*) as num_of_films
FROM
film_actor
GROUP BY actor_id;

/*Multicolumn Grouping
Expanding on the previous example, imagine that you want to find the
total number of films for each film rating (G, PG, ...) for each
actor.*/

SELECT
fa.actor_id, f.rating, COUNT(*) AS num_of_movies
FROM
film_actor AS fa
INNER JOIN
film AS f USING (film_id)
GROUP BY fa.actor_id , f.rating
ORDER BY 1 , 2;

-- or

SELECT
fa.actor_id, f.rating, COUNT(*) AS num_of_movies
FROM
film_actor AS fa
INNER JOIN
film AS f ON fa.film_id = f.film_id
GROUP BY fa.actor_id , f.rating
ORDER BY 1 , 2;

desc actor;

SELECT
fa.actor_id,
a.first_name,
a.last_name,
f.rating,
COUNT(*) AS num_of_movies
FROM
film_actor AS fa
INNER JOIN
film AS f ON fa.film_id = f.film_id
INNER JOIN
actor AS a USING (actor_id)
GROUP BY fa.actor_id , a.first_name , a.last_name , f.rating
ORDER BY 1 , 2;

/*Grouping via Expressions*/

SELECT
EXTRACT(YEAR FROM rental_date) AS year, COUNT(*) how_many
FROM
rental
GROUP BY EXTRACT(YEAR FROM rental_date);
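
/*The same idea extends to more than one expression. As a sketch, the
query below counts rentals per year and month; the aliases rental_year
and rental_month are illustrative names. Each expression in the select
clause is repeated in the group by clause.*/

SELECT
EXTRACT(YEAR FROM rental_date) AS rental_year,
EXTRACT(MONTH FROM rental_date) AS rental_month,
COUNT(*) AS how_many
FROM
rental
GROUP BY EXTRACT(YEAR FROM rental_date) , EXTRACT(MONTH FROM rental_date)
ORDER BY 1 , 2;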

/*Generating Rollups*/

SELECT
fa.actor_id, f.rating, COUNT(*)
FROM
film_actor fa
INNER JOIN
film f ON fa.film_id = f.film_id
GROUP BY fa.actor_id , f.rating WITH ROLLUP
ORDER BY 1 , 2;
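
/*A smaller sketch may make the rollup output easier to read. Grouping
the film table by rating alone, WITH ROLLUP (MySQL syntax) adds one
extra row with null in the rating column, and that row holds the count
across all ratings.*/

SELECT
rating, COUNT(*) AS num_of_films
FROM
film
GROUP BY rating WITH ROLLUP;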

/*Group Filter Conditions
When grouping data, you also can apply filter conditions to the data
after the groups have been generated. The having clause is where you
should place these types of filter conditions.*/

SELECT
fa.actor_id, f.rating, COUNT(*) AS num_of_films
FROM
film_actor AS fa
INNER JOIN
film AS f USING (film_id)
WHERE
f.rating IN ('G' , 'PG')
GROUP BY 1 , 2;

SELECT
fa.actor_id, f.rating, COUNT(*) AS num_of_films
FROM
film_actor AS fa
INNER JOIN
film AS f USING (film_id)
WHERE
f.rating IN ('G' , 'PG')
GROUP BY 1 , 2
HAVING num_of_films > 10;

/*This query has two filter conditions: one in the where clause, which
filters out any films rated something other than G or PG, and another
in the having clause, which filters out any actor and rating groups
with 10 or fewer films. Thus, one of the filters acts on data before
it is grouped, and the other filter acts on data after the groups have
been created.*/
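
/*The having clause above refers to the column alias num_of_films and
the group by clause uses column positions, both of which MySQL accepts
but some other servers do not. A more portable sketch of the same
query, assuming nothing beyond the tables already used, repeats the
aggregate and names the grouping columns explicitly.*/

SELECT
fa.actor_id, f.rating, COUNT(*) AS num_of_films
FROM
film_actor AS fa
INNER JOIN
film AS f ON fa.film_id = f.film_id
WHERE
f.rating IN ('G' , 'PG')
GROUP BY fa.actor_id , f.rating
HAVING COUNT(*) > 10;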

/*Exercise - 1
Construct a query that counts the number of rows in the payment
table.*/

desc payment;

SELECT
COUNT(*) AS number_of_rows
FROM
payment;

/*Exercise - 2
Modify your query from Exercise - 1 to count the number of payments
made by each customer. Show the customer ID and the total amount paid
for each customer.*/

SELECT
customer_id,
COUNT(payment_id) AS num_of_payments,
SUM(amount) AS total_paid
FROM
payment
GROUP BY customer_id;

SELECT
customer_id,
COUNT(payment_id) AS num_of_payments,
SUM(amount) AS total_paid
FROM
payment
GROUP BY customer_id
ORDER BY 3 DESC;

/*Exercise - 3
Modify your query from Exercise - 2 to include only those customers
who have made at least 40 payments.*/

SELECT
customer_id,
COUNT(payment_id) AS num_of_payments,
SUM(amount) AS total_paid
FROM
payment
GROUP BY customer_id
HAVING num_of_payments >= 40
ORDER BY 3 DESC;
