Lab-12-Manual - (Reporting Aggregated Data Using GROUP BY)
Spring 2024
Introduction to Lab
This lab further addresses functions. It focuses on obtaining summary information, such
as averages, for groups of rows. It discusses how to group rows in a table into smaller sets
and how to specify search criteria for groups of rows.
MySQL
Unlike single-row functions, group functions operate on sets of rows to give one result
per group. These sets may be the whole table or the table split into groups.
For example, what is the maximum salary in the employees table?
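As a minimal sketch of the question above, assuming the emp table with a numeric sal column that the later examples in this manual use, a group function applied without a GROUP BY clause treats the whole table as one group and returns a single row. The alias max_salary is only illustrative.

```sql
-- The whole table is one group, so exactly one row is returned.
SELECT MAX(sal) AS max_salary
FROM emp;
```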
The group functions covered in this lab are:
AVG
COUNT
MAX
MIN
STDDEV
SUM
VARIANCE
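Each of the functions accepts an argument. The following table identifies the options
that you can use in the syntax:
Function                  Description
AVG([DISTINCT] expr)      Average value of expr, ignoring null values
COUNT(*)                  Number of rows, including duplicates and rows with null values
COUNT([DISTINCT] expr)    Number of rows where expr is not null
MAX(expr)                 Maximum value of expr, ignoring null values
MIN(expr)                 Minimum value of expr, ignoring null values
STDDEV(expr)              Population standard deviation of expr, ignoring null values
SUM([DISTINCT] expr)      Sum of the values of expr, ignoring null values
VARIANCE(expr)            Population variance of expr, ignoring null values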
• All group functions ignore null values. To substitute a value for null values, use the
IFNULL or COALESCE functions.
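MIN and MAX can be used with numeric, character, and date data types. Applied to the
employees' hire dates, they identify the most junior and most senior employees.
Applied to the employee name column, they display the employee name that is first and
the employee name that is last in an alphabetized list of all employees.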
Note: The AVG, SUM, VARIANCE, and STDDEV functions can be used only with
numeric data types.
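The MIN and MAX queries described above are not reproduced in this text. A minimal sketch of what they might look like follows. It assumes that emp stores the hire date in a column named hiredate (an assumed name that does not appear elsewhere in this manual) and the employee name in ename. The aliases are illustrative.

```sql
-- Earliest and latest hire dates: the most senior and most junior employees.
-- hiredate is an assumed column name.
SELECT MIN(hiredate) AS earliest_hire,
       MAX(hiredate) AS latest_hire
FROM emp;

-- Names that sort first and last alphabetically.
SELECT MIN(ename) AS first_name,
       MAX(ename) AS last_name
FROM emp;
```

The alphabetical result depends on the collation of the ename column.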
COUNT(*)
COUNT(expr)
COUNT(DISTINCT expr)
COUNT(*) returns the number of rows in a table that satisfy the criteria of the SELECT
statement, including duplicate rows and rows containing null values in any of the
columns. If a WHERE clause is included in the SELECT statement, COUNT(*) returns the
number of rows that satisfy the condition in the WHERE clause.
In contrast, COUNT(expr) returns the number of non-null values in the column identified
by expr.
COUNT(DISTINCT expr) returns the number of unique, non-null values in the column
identified by expr.
Note: Write the function name and the opening parenthesis without a space between them.
In MySQL's default SQL mode, COUNT (*) with a space is rejected as a syntax error.
Examples:
1. SELECT COUNT(*) FROM emp;
2. SELECT COUNT(*) FROM emp WHERE deptno = 10;
3. SELECT COUNT(comm) FROM emp;
4. SELECT COUNT(comm) FROM emp WHERE deptno = 30;
5. SELECT COUNT(deptno) FROM emp;
6. SELECT COUNT(DISTINCT deptno) FROM emp;
When AVG is applied directly to the COMM column, the average is calculated based only on
the rows in the table where a valid value is stored in the COMM column. The average is
calculated as the total commission paid to all employees divided by the number of
employees receiving commission.
When null values are first replaced with zero by using IFNULL, the average is calculated
based on all rows in the table, regardless of whether null values are stored in the COMM
column. The average is calculated as the total commission that is paid to all employees
divided by the total number of employees in the company.
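The two averages described above can be compared side by side. The following is a sketch, assuming the emp table with a nullable comm column and using IFNULL to treat a missing commission as 0. The aliases are illustrative. The two check columns recompute each average by hand.

```sql
SELECT AVG(comm)               AS avg_of_earners,   -- null commissions are skipped
       AVG(IFNULL(comm, 0))    AS avg_of_everyone,  -- null commissions count as 0
       SUM(comm) / COUNT(comm) AS check_earners,    -- total / employees with a commission
       SUM(comm) / COUNT(*)    AS check_everyone    -- total / all employees
FROM emp;
```

Each check column should match the corresponding average as long as at least one employee has a non-null commission.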
Until now, all group functions have treated the table as one large group of information. At
times, you need to divide the table of information into smaller groups. For example, display
the average salary in the EMP table for each department.
You can use the GROUP BY clause to divide the rows in a table into groups. You can then
use the group functions to return summary information for each group.
In the syntax, group_by_expression specifies columns whose values determine the basis
for grouping rows.
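The syntax referred to above is not reproduced in this text. As a sketch, the general shape commonly given for a grouped query is shown in comments below, followed by one concrete instance against the emp table. The alias avg_salary is illustrative.

```sql
-- General shape (square brackets mark optional clauses; not runnable as written):
--   SELECT   column, group_function(column)
--   FROM     table
--   [WHERE   condition]
--   [GROUP BY group_by_expression]
--   [ORDER BY column];

-- Concrete instance: one summary row per department.
SELECT deptno, AVG(sal) AS avg_salary
FROM emp
GROUP BY deptno;
```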
Guidelines:
• If you include a group function in a SELECT clause, you cannot also select an
individual column unless that column appears in the GROUP BY clause. You receive
an error message if you fail to include the column in the GROUP BY clause.
• Using a WHERE clause, you can exclude rows before dividing them into groups.
• You cannot use a group function in the WHERE clause.
• You must include the non-aggregated columns in the GROUP BY clause. Otherwise,
the query either produces an error or returns an incorrect result.
• You should not rely on a column alias in the GROUP BY clause. Standard SQL does not
allow it, but some databases, including MySQL, permit it as an extension.
• Do not rely on GROUP BY to sort the output. MySQL 5.7 and earlier sorted rows
implicitly in ascending order of the columns in the GROUP BY list, but MySQL 8.0
does not. Use the ORDER BY clause whenever you need a specific order.
Using the GROUP BY Clause
When using the GROUP BY clause, make sure that all columns in the SELECT list that are
not group functions are included in the GROUP BY clause. For example,
1. SELECT deptno, AVG(sal) FROM emp GROUP BY deptno;
2. SELECT deptno, COUNT(*) FROM emp WHERE sal > 1200 GROUP BY deptno;
3. SELECT AVG(sal) FROM emp GROUP BY deptno;
4. SELECT deptno, COUNT(*) FROM emp WHERE sal > 1200 GROUP BY deptno ORDER BY deptno DESC;
The GROUP BY column does not have to be in the SELECT list, as in query 3 above.
Sometimes you need to see results for groups within groups. For example, display the
total salary paid in the EMP table for each job within each department.
Note: The order of the columns in the GROUP BY clause describes how the grouping is nested,
but it does not change which groups are formed. Each distinct combination of the column
values is one group.
1. Grouping Behavior:
a) The GROUP BY clause defines how rows are grouped into summary rows.
b) Conceptually, rows are grouped by the first column listed, and then within those
groups, further grouping is done by the subsequent columns.
2. Impact on Aggregation:
a) The order of columns in GROUP BY affects only the hierarchy of grouping. For example,
grouping by deptno first and then by job means you first group the rows by deptno, and within
each deptno group, you further group by job. Grouping by job first and then by deptno
produces the same groups and the same aggregate values.
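The groups-within-groups example described above (total salary for each job within each department) can be sketched as follows, assuming the emp table with deptno, job, and sal columns. The alias total_salary is illustrative, and the ORDER BY clause is added only to make the nesting visible in the output.

```sql
-- One row per (deptno, job) combination.
SELECT deptno, job, SUM(sal) AS total_salary
FROM emp
GROUP BY deptno, job
ORDER BY deptno, job;
```

Writing GROUP BY job, deptno instead should return the same groups and the same sums. Only the ORDER BY clause controls the order in which the rows are listed.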
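Any column or expression in the SELECT list that is not an aggregate function must be in
the GROUP BY clause. Failure to do so results in an error when MySQL's ONLY_FULL_GROUP_BY
mode is enabled (the default in recent versions), or in an arbitrary and misleading
value when it is disabled. For example,
SELECT deptno, COUNT(ename) FROM emp;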
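The query above can be corrected by adding the missing GROUP BY clause. A minimal sketch, with an illustrative alias:

```sql
-- deptno is not inside a group function, so it is listed in GROUP BY.
SELECT deptno, COUNT(ename) AS employee_count
FROM emp
GROUP BY deptno;
```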
Also, the WHERE clause cannot be used to restrict groups. For example, the following query will
result in an error:
SELECT deptno, AVG(sal)
FROM emp
WHERE AVG(sal) > 1500
GROUP BY deptno;
You can correct this error by using the HAVING clause to restrict groups:
SELECT deptno, AVG(sal)
FROM emp
GROUP BY deptno
HAVING AVG(sal) > 1500;
In the same way that you use the WHERE clause to restrict the rows that you select, you
use the HAVING clause to restrict groups. For example, find the maximum salary per
department when it is greater than $2000.
By using the HAVING clause, we can restrict the groups on the basis of an aggregate
function. In the syntax, group_condition restricts the groups of rows returned
to those groups for which the specified condition is true.
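The example described above (maximum salary per department, kept only when it exceeds 2000) can be sketched as follows. The commented skeleton shows where group_condition is commonly placed. The alias max_salary is illustrative.

```sql
-- General shape (not runnable as written):
--   SELECT   column, group_function(column)
--   FROM     table
--   [WHERE   condition]
--   [GROUP BY group_by_expression]
--   [HAVING  group_condition]
--   [ORDER BY column];

-- Groups are formed first; HAVING then discards groups that fail the condition.
SELECT deptno, MAX(sal) AS max_salary
FROM emp
GROUP BY deptno
HAVING MAX(sal) > 2000;
```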
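Note: You can use a GROUP BY clause without including the grouping column in the SELECT
list, and the HAVING clause can use a group function that is not in the SELECT list. The
downside of omitting the grouping column is that the output does not show which group
each row belongs to.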
For example,
SELECT AVG(sal)
FROM emp
GROUP BY deptno
HAVING MAX(sal) > 2000;
And also,
SELECT job, SUM(sal) PAYROLL
FROM emp
WHERE job NOT LIKE '%MAN%'
GROUP BY job
HAVING SUM(sal) >= 5000
ORDER BY SUM(sal);
1. Display the highest, lowest, sum, and average salary of all employees. Label the
columns Maximum, Minimum, Sum, and Average respectively. Round your results to
the nearest whole number.
2. Modify the above query to display the minimum, maximum, sum, and average salary
for each job type.
3. Write a query to display the number of people with the same job.
4. Determine the number of managers without listing them. Label the column Number
of Managers. Hint: Use the MGR column to determine the number of managers.
5. Write a query that displays the difference between the highest and lowest salaries.
Label the column DIFFERENCE.
6. Display the manager id and the salary of the lowest paid employee for that manager.
Exclude anyone whose manager is not known. Exclude any groups where the
minimum salary is less than $1500. Sort the output in descending order of minimum
salary.
7. Write a query to display each department’s name, location, number of employees, and
the average salary of all employees of that department. Label the columns Name,
Location, Number of People, and Salary respectively.
The End