Sorting and Grouping Data
de Castro
Grouping data is the process of combining
columns with duplicate values in a logical
order.
SELECT
FROM
WHERE
GROUP BY
ORDER BY
SELECT statement’s syntax, including the
GROUP BY clause:
CITY
-------------
GREENWOOD
INDIANAPOLIS
WHITELAND
INDIANAPOLIS
INDIANAPOLIS
INDIANAPOLIS
6 rows selected.
Select the city and a count of all records for each city.
You see a count for each of the three distinct cities
because you are using a GROUP BY clause:
SELECT CITY, COUNT(*)
FROM EMPLOYEE_TBL
GROUP BY CITY;
CITY COUNT(*)
-------------- --------
GREENWOOD 1
INDIANAPOLIS 4
WHITELAND 1
3 rows selected.
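The grouping above can be reproduced with a small runnable sketch. It assumes SQLite, driven through Python's standard sqlite3 module, as a stand-in for the database used in these slides. The city values are copied from the output above, and the ORDER BY is an addition to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database; city values are copied from the output above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
cities = ["GREENWOOD", "INDIANAPOLIS", "WHITELAND",
          "INDIANAPOLIS", "INDIANAPOLIS", "INDIANAPOLIS"]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?)", [(c,) for c in cities])

# One output row per distinct CITY, with the number of rows in each group.
# ORDER BY is added so the row order does not depend on the engine.
city_counts = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY CITY ORDER BY CITY"
).fetchall()

for city, count in city_counts:
    print(city, count)
```

Six input rows collapse into three groups, one per distinct city.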
A query from a temporary table
SELECT *
FROM EMP_PAY_TMP;
6 rows selected.
Retrieve the average pay rate and salary for each
distinct city using the aggregate function AVG.
3 rows selected.
SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
WHERE CITY IN ('INDIANAPOLIS','WHITELAND')
GROUP BY CITY
ORDER BY 2,3;
CITY          AVG(PAY_RATE)  AVG(SALARY)
------------  -------------  -----------
INDIANAPOLIS  13.5833333     20000
WHITELAND                    40000
This example shows the use of the MAX and MIN aggregate
functions with the GROUP BY clause:
SELECT CITY, MAX(PAY_RATE), MIN(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;
3 rows selected.
As with the ORDER BY clause, columns in the GROUP BY clause can be
referenced by an integer that gives the column's position in the
SELECT list. Not every implementation accepts integers in GROUP BY.
The following is an example of representing column names with numbers:
SELECT YEAR(DATE_HIRE) as YEAR_HIRED, SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY 1;
YEAR_HIRED SUM(SALARY)
------------- ------------------------
1989 40000.00
1990
1991
1994 30000.00
1996
1997 20000.00
6 rows selected.
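Grouping by column position can be sketched as follows, again assuming SQLite through Python's sqlite3 module. SQLite has no YEAR() function, so strftime('%Y', ...) is swapped in to extract the year. The hire dates are hypothetical; only the years and salary sums mirror part of the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_PAY_TBL (DATE_HIRE TEXT, SALARY REAL)")
# Hypothetical rows: only the years and sums mirror the output above.
conn.executemany(
    "INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?)",
    [("1989-05-23", 40000.0), ("1990-06-17", None),
     ("1994-08-14", 30000.0), ("1997-07-22", 20000.0)],
)

# GROUP BY 1 groups by the first item in the SELECT list (the hire year).
# strftime('%Y', ...) stands in for YEAR(), which SQLite does not provide.
by_year = conn.execute(
    """
    SELECT CAST(strftime('%Y', DATE_HIRE) AS INTEGER) AS YEAR_HIRED, SUM(SALARY)
    FROM EMPLOYEE_PAY_TBL
    GROUP BY 1
    ORDER BY 1
    """
).fetchall()

for year, total in by_year:
    print(year, total)
```

A year whose rows all have a NULL salary produces a NULL sum, which is why some years in the output above show no value.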
The GROUP BY clause resembles the ORDER BY
clause in that both can affect the order of data.
The ORDER BY clause is specifically used to sort
data from a query. In many implementations, the
GROUP BY clause also sorts data from a query in
order to group it properly, so grouped output
often comes back sorted. That ordering is a side
effect and is not guaranteed; use ORDER BY
whenever the output order matters.
SELECT statement to the GROUP BY clause:
SELECT LAST_NAME, FIRST_NAME, CITY
FROM EMPLOYEE_TBL
GROUP BY LAST_NAME, FIRST_NAME, CITY;
6 rows selected.
The following SELECT statement from EMPLOYEE_TBL
groups by CITY and LAST_NAME; in this implementation,
the grouped output also comes back ordered by CITY:
SELECT CITY, LAST_NAME
FROM EMPLOYEE_TBL
GROUP BY CITY, LAST_NAME;
CITY LAST_NAME
------------ ---------
GREENWOOD STEPHENS
INDIANAPOLIS GLASS
INDIANAPOLIS PLEW
INDIANAPOLIS SPURGEON
INDIANAPOLIS WALLACE
WHITELAND GLASS
6 rows selected.
All employee records in the EMPLOYEE_TBL table
are now counted, and the results are grouped by
CITY, but ordered by the count on each city first:
SELECT CITY, COUNT(*)
FROM EMPLOYEE_TBL
GROUP BY CITY
ORDER BY 2,1;
CITY COUNT(*)
----------- -----------
GREENWOOD 1
WHITELAND 1
INDIANAPOLIS 4
Although GROUP BY and ORDER BY perform a
similar function, there is one major difference.
The GROUP BY clause is designed to group
identical data, whereas the ORDER BY clause is
designed merely to put data into a specific
order. GROUP BY and ORDER BY can be used
in the same SELECT statement, but must
follow a specific order. The GROUP BY clause is
always placed before the ORDER BY clause in
the SELECT statement.
The HAVING clause, when used in conjunction with
the GROUP BY clause in a SELECT statement, tells
GROUP BY which groups to include in the output.
HAVING is to GROUP BY as WHERE is to SELECT. In
other words, the WHERE clause places conditions on
the selected columns, and the HAVING clause
places conditions on groups created by the
GROUP BY clause.
The following is the position of the HAVING
clause in a query:
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
The following is the syntax of the SELECT
statement, including the HAVING clause:
SELECT COLUMN1, COLUMN2
FROM TABLE1, TABLE2
WHERE CONDITIONS
GROUP BY COLUMN1, COLUMN2
HAVING CONDITIONS
ORDER BY COLUMN1, COLUMN2
In the following example, you select the average pay rate and
salary for all cities except GREENWOOD. You group the output by
CITY, but only want to display those groups (cities) that have an
average salary greater than $20,000. You sort the results by
average salary for each city:
SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
WHERE CITY <> 'GREENWOOD'
GROUP BY CITY
HAVING AVG(SALARY) > 20000
ORDER BY 3;
CITY          AVG(PAY_RATE)  AVG(SALARY)
------------  -------------  -----------
WHITELAND                    40000
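To make the difference between WHERE and HAVING concrete, here is a minimal sketch that runs the same query with and without the HAVING clause. It assumes SQLite through Python's sqlite3 module. The pay data is hypothetical, since the contents of EMP_PAY_TMP are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY REAL)")
# Hypothetical pay data: hourly employees have no SALARY and vice versa.
conn.executemany(
    "INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)",
    [("GREENWOOD", None, 30000.0),
     ("INDIANAPOLIS", 10.0, None),
     ("INDIANAPOLIS", 20.0, None),
     ("INDIANAPOLIS", None, 20000.0),
     ("WHITELAND", None, 40000.0)],
)

query = """
    SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
    FROM EMP_PAY_TMP
    WHERE CITY <> 'GREENWOOD'   -- filters rows before grouping
    GROUP BY CITY
    {having}
    ORDER BY 3
"""
# Without HAVING, every remaining group is returned.
without_having = conn.execute(query.format(having="")).fetchall()
# With HAVING, groups are filtered after the averages are computed.
with_having = conn.execute(
    query.format(having="HAVING AVG(SALARY) > 20000")
).fetchall()

print(without_having)
print(with_having)
```

WHERE removes the GREENWOOD rows before any group is formed. HAVING then drops the INDIANAPOLIS group because its average salary is not greater than 20000. AVG ignores NULL values, so a city with no hourly employees has a NULL average pay rate.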
Result:
'EastBoston'
Result:
'East Boston'
This statement concatenates the last name with
the first name and inserts a comma between the
two original values. It uses the || operator, as
in Oracle; SQL Server uses + for concatenation.
SELECT LAST_NAME || ', ' || FIRST_NAME NAME
FROM EMPLOYEE_TBL;
NAME
-----------------
STEPHENS, TINA
PLEW, LINDA
GLASS, BRANDON
GLASS, JACOB
WALLACE, MARIAH
SPURGEON, TIFFANY
6 rows selected.
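The concatenation can be tried in a short sketch, assuming SQLite through Python's sqlite3 module; SQLite also uses the || operator. The names are taken from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (LAST_NAME TEXT, FIRST_NAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("STEPHENS", "TINA"), ("PLEW", "LINDA"), ("GLASS", "BRANDON")],
)

# || joins the three pieces into one string; NAME is the column alias.
names = [
    row[0]
    for row in conn.execute(
        "SELECT LAST_NAME || ', ' || FIRST_NAME AS NAME "
        "FROM EMPLOYEE_TBL ORDER BY rowid"
    )
]
print(names)
```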
The TRANSLATE function searches a string of
characters and checks for a specific character,
makes note of the position found, searches the
replacement string at the same position, and then
replaces that character with the new value. The syntax is
CITY CITY_TRANSLATION
------------ ------------
GREENWOOD GREEBWOOC
INDIANAPOLIS ABCAABAPOLAS
WHITELAND WHATELABC
INDIANAPOLIS ABCAABAPOLAS
INDIANAPOLIS ABCAABAPOLAS
INDIANAPOLIS ABCAABAPOLAS
6 rows selected.
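The position-by-position replacement can be sketched as follows. SQLite has no built-in TRANSLATE, so this example registers a hypothetical Python helper under that name through sqlite3's create_function. The arguments 'IND' and 'ABC' are an assumption inferred from the output above, and the helper assumes the search and replacement strings have equal length.

```python
import sqlite3

def translate(text, search, replacement):
    """Replace each character found in `search` with the character at the
    same position in `replacement` (equal-length strings assumed)."""
    if text is None:
        return None
    return text.translate(str.maketrans(search, replacement))

conn = sqlite3.connect(":memory:")
# Register the helper as a three-argument SQL function named TRANSLATE.
conn.create_function("TRANSLATE", 3, translate)
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?)",
    [("GREENWOOD",), ("INDIANAPOLIS",), ("WHITELAND",)],
)

# I becomes A, N becomes B, D becomes C; other characters are unchanged.
translated = conn.execute(
    "SELECT CITY, TRANSLATE(CITY, 'IND', 'ABC') FROM EMPLOYEE_TBL ORDER BY rowid"
).fetchall()
print(translated)
```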
The REPLACE function is used to replace every
occurrence of a character or string with a
specified character or string. Its use is similar
to that of the TRANSLATE function, except that only
one specific character or string is replaced
within another string.
This statement returns all of the cities in the employee
table and the same cities with each I replaced with a Z:
SELECT CITY, REPLACE(CITY,'I','Z')
FROM EMPLOYEE_TBL;
CITY REPLACE(CITY)
------------ -------------
GREENWOOD GREENWOOD
INDIANAPOLIS ZNDZANAPOLZS
WHITELAND WHZTELAND
INDIANAPOLIS ZNDZANAPOLZS
INDIANAPOLIS ZNDZANAPOLZS
INDIANAPOLIS ZNDZANAPOLZS
6 rows selected.
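A runnable version of the same replacement, assuming SQLite through Python's sqlite3 module; SQLite provides a REPLACE function that takes the same three arguments. The city values come from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?)",
    [("GREENWOOD",), ("INDIANAPOLIS",), ("WHITELAND",)],
)

# Every occurrence of 'I' is replaced; strings without an 'I' are unchanged.
replaced = conn.execute(
    "SELECT CITY, REPLACE(CITY, 'I', 'Z') FROM EMPLOYEE_TBL ORDER BY rowid"
).fetchall()
print(replaced)
```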
The UPPER function is used to convert lowercase
letters to uppercase letters for a specific string.
UPPER(character string)
This SQL statement converts all characters in the
column to uppercase:
SELECT UPPER(CITY)
FROM EMPLOYEE_TBL;
UPPER(CITY)
-------------
GREENWOOD
INDIANAPOLIS
WHITELAND
INDIANAPOLIS
INDIANAPOLIS
INDIANAPOLIS
6 rows selected.
The LOWER function is used to convert uppercase
letters to lowercase letters for a specific string.
LOWER(character string)
This SQL statement converts all characters in the
column to lowercase:
SELECT LOWER(CITY)
FROM EMPLOYEE_TBL;
LOWER(CITY)
---------------
greenwood
indianapolis
whiteland
indianapolis
indianapolis
indianapolis
6 rows selected.
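Both case conversions can be seen side by side in a small sketch, assuming SQLite through Python's sqlite3 module. The mixed-case city values are hypothetical, chosen so that both functions visibly change the input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
# Hypothetical mixed-case values so both conversions have a visible effect.
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?)",
    [("Greenwood",), ("Indianapolis",)],
)

case_rows = conn.execute(
    "SELECT CITY, UPPER(CITY), LOWER(CITY) FROM EMPLOYEE_TBL ORDER BY rowid"
).fetchall()
print(case_rows)
```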
The syntax for SQL Server is
SUBSTRING(COLUMN NAME, STARTING POSITION, LENGTH)
SELECT PROD_DESC,
INSTR(PROD_DESC,'A',1,1)
FROM PRODUCTS_TBL;
PROD_DESC INSTR(PROD_DESC,'A',1,1)
----------------------------------- ------------------------------------
WITCHES COSTUME 0
PLASTIC PUMPKIN 18 INCH 3
FALSE PARAFFIN TEETH 2
LIGHTED LANTERNS 10
ASSORTED COSTUMES 1
CANDY CORN 2
PUMPKIN CANDY 10
PLASTIC SPIDERS 3
ASSORTED MASKS 1
KEY CHAIN 7
OAK BOOKSHELF 2
11 rows selected.
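Searching for a character and extracting a substring can be sketched together, assuming SQLite through Python's sqlite3 module. SQLite's INSTR takes only two arguments (string, search string), which corresponds to starting at position 1 and looking for the first occurrence, as in the query above. SUBSTR is SQLite's name for the substring function. The product descriptions come from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCTS_TBL (PROD_DESC TEXT)")
conn.executemany(
    "INSERT INTO PRODUCTS_TBL VALUES (?)",
    [("WITCHES COSTUME",), ("PLASTIC PUMPKIN 18 INCH",), ("KEY CHAIN",)],
)

# INSTR returns the 1-based position of the first 'A', or 0 if there is none.
# SUBSTR(value, start, length) returns `length` characters from `start`.
positions = conn.execute(
    "SELECT PROD_DESC, INSTR(PROD_DESC, 'A'), SUBSTR(PROD_DESC, 1, 3) "
    "FROM PRODUCTS_TBL ORDER BY rowid"
).fetchall()
print(positions)
```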
LTRIM is used to trim characters from the left of a
string. The syntax is
The syntax for DECODE is
DECODE(COLUMN NAME, 'SEARCH1', 'RETURN1',[ 'SEARCH2',
'RETURN2', 'DEFAULT VALUE'])
In the following example, DECODE is used on the values for CITY in EMPLOYEE_TBL:
SELECT CITY,
DECODE(CITY,'INDIANAPOLIS','INDY',
'GREENWOOD','GREEN','OTHER')
FROM EMPLOYEE_TBL;
CITY DECOD
------------------ ---------------------
GREENWOOD GREEN
INDIANAPOLIS INDY
WHITELAND OTHER
INDIANAPOLIS INDY
INDIANAPOLIS INDY
INDIANAPOLIS INDY
6 rows selected.
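DECODE is not available in every implementation. As a sketch of the same search-and-return logic, the example below uses a standard CASE expression instead, assuming SQLite through Python's sqlite3 module. This is a swapped-in technique, not the DECODE call itself. The city values come from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?)",
    [("GREENWOOD",), ("INDIANAPOLIS",), ("WHITELAND",)],
)

# Each WHEN/THEN pair plays the role of a SEARCH/RETURN pair in DECODE;
# ELSE supplies the default value.
decoded = conn.execute(
    """
    SELECT CITY,
           CASE CITY
               WHEN 'INDIANAPOLIS' THEN 'INDY'
               WHEN 'GREENWOOD' THEN 'GREEN'
               ELSE 'OTHER'
           END
    FROM EMPLOYEE_TBL
    ORDER BY rowid
    """
).fetchall()
print(decoded)
```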
LENGTH is a common function used to find the length
of a string, number, date, or expression. Depending
on the implementation, the result is counted in
bytes or in characters.
The syntax is
LENGTH(CHARACTER STRING)
This SQL statement returns the product
description and also its corresponding length:
SELECT PROD_DESC, LENGTH(PROD_DESC)
FROM PRODUCTS_TBL;
PROD_DESC LENGTH(PROD_DESC)
--------------------------- --------------------------------
WITCHES COSTUME 15
PLASTIC PUMPKIN 18 INCH 23
FALSE PARAFFIN TEETH 20
LIGHTED LANTERNS 16
ASSORTED COSTUMES 17
CANDY CORN 10
PUMPKIN CANDY 13
PLASTIC SPIDERS 15
ASSORTED MASKS 14
KEY CHAIN 9
OAK BOOKSHELF 13
11 rows selected.
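A short runnable check of LENGTH, assuming SQLite through Python's sqlite3 module; for text values SQLite's LENGTH counts characters. The product descriptions come from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCTS_TBL (PROD_DESC TEXT)")
conn.executemany(
    "INSERT INTO PRODUCTS_TBL VALUES (?)",
    [("WITCHES COSTUME",), ("CANDY CORN",), ("KEY CHAIN",)],
)

# Spaces count toward the length, just like any other character.
lengths = conn.execute(
    "SELECT PROD_DESC, LENGTH(PROD_DESC) FROM PRODUCTS_TBL ORDER BY rowid"
).fetchall()
print(lengths)
```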
The IFNULL function is used to return data from one
expression if another expression is NULL. IFNULL can
be used with most data types; however, the value and
the substitute must be the same data type.
The syntax is
IFNULL('VALUE', 'SUBSTITUTION')
This SQL statement finds NULL values and substitutes
9999999999 for any NULL values:
SELECT PAGER, IFNULL(PAGER,9999999999)
FROM EMPLOYEE_TBL;
PAGER IFNULL(PAGER,
-------------------- --------------------------
9999999999
9999999999
3175709980 3175709980
8887345678 8887345678
9999999999
9999999999
6 rows selected.
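The substitution can be sketched as follows, assuming SQLite through Python's sqlite3 module; SQLite provides IFNULL with the same two arguments. The pager values come from the output above, with NULL standing for the blank entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (PAGER INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?)",
    [(None,), (3175709980,), (8887345678,)],
)

# NULL pagers are replaced by 9999999999; non-NULL values pass through.
pagers = conn.execute(
    "SELECT PAGER, IFNULL(PAGER, 9999999999) FROM EMPLOYEE_TBL ORDER BY rowid"
).fetchall()
print(pagers)
```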
The COALESCE function is similar to the IFNULL
function in that it is used specifically to replace
NULL values within the result set. COALESCE, however,
can accept a whole set of values and checks each one
in order until it finds a non-NULL result. If no
non-NULL result is present, COALESCE returns a
NULL value.
The following example demonstrates the COALESCE function by
giving us the first non-NULL value of BONUS, SALARY, and
PAY_RATE:
SELECT EMP_ID, COALESCE(BONUS,SALARY,PAY_RATE)
FROM EMPLOYEE_PAY_TBL;
EMP_ID COALESCE(BONUS,SALARY,PAY_RATE)
-------------------------- ---------------------------------------------------
213764555 2000.00
220984332 11.00
311549902 40000.00
313782439 1000.00
442346889 14.75
443679012 15.00
6 rows selected.
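The left-to-right check can be sketched with a few rows, assuming SQLite through Python's sqlite3 module. The employee IDs and pay values here are hypothetical, chosen so that each row resolves at a different argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_PAY_TBL "
    "(EMP_ID INTEGER, BONUS REAL, SALARY REAL, PAY_RATE REAL)"
)
# Hypothetical rows: each one has its first non-NULL value in a different column.
conn.executemany(
    "INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?, ?, ?)",
    [(1, 2000.0, 30000.0, None),   # BONUS is present, so it wins
     (2, None, 40000.0, None),     # falls through to SALARY
     (3, None, None, 11.0),        # falls through to PAY_RATE
     (4, None, None, None)],       # nothing non-NULL, so the result is NULL
)

first_values = conn.execute(
    "SELECT EMP_ID, COALESCE(BONUS, SALARY, PAY_RATE) "
    "FROM EMPLOYEE_PAY_TBL ORDER BY EMP_ID"
).fetchall()
print(first_values)
```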
LPAD (left pad) is used to add characters or spaces
to the left of a string. The syntax is
LPAD(CHARACTER SET)