Sorting and Grouping Data

The document discusses various SQL functions and clauses used for string manipulation and grouping data, including: - CONCATENATE combines two strings into one - TRANSLATE searches and replaces characters in a string - REPLACE replaces all occurrences of one character with another in a string - GROUP BY groups identical data and is used with aggregate functions like COUNT, AVG, MAX, MIN - HAVING filters groups created by GROUP BY
Copyright
© Attribution Non-Commercial (BY-NC)

Vrigin Kathleen A. de Castro

- Grouping data is the process of combining rows that have
  duplicate values in one or more columns, so that each
  distinct value appears once in the output.
- Grouping data is accomplished through the use of the
  GROUP BY clause of a SELECT statement (query).
- The GROUP BY clause is used with the SELECT statement
  to arrange identical data into groups.
- It follows the WHERE clause in a SELECT statement and
  precedes the ORDER BY clause.

The position of the GROUP BY clause in a query
is as follows:

SELECT
FROM
WHERE
GROUP BY
ORDER BY
The SELECT statement's syntax, including the
GROUP BY clause, is as follows:

SELECT COLUMN1, COLUMN2
FROM TABLE1, TABLE2
WHERE CONDITIONS
GROUP BY COLUMN1, COLUMN2
ORDER BY COLUMN1, COLUMN2
You can see that there are three distinct cities
in the EMPLOYEE_TBL table:
SELECT CITY
FROM EMPLOYEE_TBL;

CITY
-------------
GREENWOOD
INDIANAPOLIS
WHITELAND
INDIANAPOLIS
INDIANAPOLIS
INDIANAPOLIS

6 rows selected.
Select the city and a count of all records for each city.
You see a count for each of the three distinct cities
because you are using a GROUP BY clause:
SELECT CITY, COUNT(*)
FROM EMPLOYEE_TBL
GROUP BY CITY;

CITY COUNT(*)
-------------- --------
GREENWOOD 1
INDIANAPOLIS 4
WHITELAND 1

3 rows selected.
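
As a rough check of the count-per-city result above, the same GROUP BY query can be run against an in-memory SQLite database from Python. SQLite and the sqlite3 module are assumptions made for this sketch (the slides themselves target Oracle, MySQL, and SQL Server), and only the CITY column of EMPLOYEE_TBL is recreated.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
cities = ["GREENWOOD", "INDIANAPOLIS", "WHITELAND",
          "INDIANAPOLIS", "INDIANAPOLIS", "INDIANAPOLIS"]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?)", [(c,) for c in cities])

# One output row per distinct city, with the number of rows in each group.
# ORDER BY is added so the row order does not depend on the engine.
city_counts = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY CITY ORDER BY CITY"
).fetchall()
print(city_counts)
```

Six input rows collapse into three groups, one per distinct city.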
The following query selects every row from a temporary table:
SELECT *
FROM EMP_PAY_TMP;

CITY         LAST_NAM FIRST_NA PAY_RATE SALARY
------------ -------- -------- -------- ------
GREENWOOD    STEPHENS TINA               30000
INDIANAPOLIS PLEW     LINDA       14.75
WHITELAND    GLASS    BRANDON            40000
INDIANAPOLIS GLASS    JACOB              20000
INDIANAPOLIS WALLACE  MARIAH         11
INDIANAPOLIS SPURGEON TIFFANY        15

6 rows selected.

Retrieve the average pay rate and salary on each
distinct city using the aggregate function AVG.

SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
GREENWOOD                        30000
INDIANAPOLIS    13.5833333       20000
WHITELAND                        40000

3 rows selected.
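
The blank cells in the output above are NULLs. A minimal sketch of how AVG treats them, assuming SQLite through Python's sqlite3 module (the slides do not use SQLite) and recreating only the three columns that the query needs:

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)",
    [
        ("GREENWOOD", None, 30000),
        ("INDIANAPOLIS", 14.75, None),
        ("WHITELAND", None, 40000),
        ("INDIANAPOLIS", None, 20000),
        ("INDIANAPOLIS", 11, None),
        ("INDIANAPOLIS", 15, None),
    ],
)

# AVG skips NULLs inside a group; a group with only NULLs yields NULL (None).
avg_rows = conn.execute(
    "SELECT CITY, AVG(PAY_RATE), AVG(SALARY) "
    "FROM EMP_PAY_TMP GROUP BY CITY ORDER BY CITY"
).fetchall()
for row in avg_rows:
    print(row)
```

The INDIANAPOLIS pay rate is (14.75 + 11 + 15) / 3, because the row with a NULL pay rate is left out of the average rather than counted as zero.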
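The following query restricts the rows to two cities with the
WHERE clause, groups them by CITY, and sorts by the second and
third columns of the select list:
SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
WHERE CITY IN ('INDIANAPOLIS','WHITELAND')
GROUP BY CITY
ORDER BY 2,3;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
INDIANAPOLIS    13.5833333       20000
WHITELAND                        40000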
The following example shows the use of the MAX and MIN
aggregate functions with the GROUP BY clause:
SELECT CITY, MAX(PAY_RATE), MIN(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;

CITY         MAX(PAY_RATE) MIN(SALARY)
------------ ------------- -----------
GREENWOOD                        30000
INDIANAPOLIS            15       20000
WHITELAND                        40000

3 rows selected.
Like the ORDER BY clause, the GROUP BY clause in some implementations
(MySQL, for example) accepts an integer that stands for a column's
position in the select list instead of the column name. The following is
an example of representing column names with numbers:
SELECT YEAR(DATE_HIRE) as YEAR_HIRED, SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY 1;

YEAR_HIRED SUM(SALARY)
---------- -----------
1989          40000.00
1990
1991
1994          30000.00
1996
1997          20000.00

6 rows selected.
The GROUP BY clause can appear to work like the
ORDER BY clause, because some implementations sort
the data in order to group it and return the groups
in that sorted order. The ORDER BY clause, however,
is the only clause specifically meant to sort the
output of a query, and the order produced by GROUP BY
alone is not guaranteed. When the order of the result
matters, add an ORDER BY clause.
The following SELECT statement lists every selected column in the
GROUP BY clause; in this output the rows come back sorted by those columns:
SELECT LAST_NAME, FIRST_NAME, CITY
FROM EMPLOYEE_TBL
GROUP BY LAST_NAME, FIRST_NAME, CITY;

LAST_NAME FIRST_NAME CITY
--------- ---------- ------------
GLASS     BRANDON    WHITELAND
GLASS     JACOB      INDIANAPOLIS
PLEW      LINDA      INDIANAPOLIS
SPURGEON  TIFFANY    INDIANAPOLIS
STEPHENS  TINA       GREENWOOD
WALLACE   MARIAH     INDIANAPOLIS

6 rows selected.
The following SELECT statement from EMPLOYEE_TBL groups by CITY
and LAST_NAME; in this output the rows come back ordered by CITY:
SELECT CITY, LAST_NAME
FROM EMPLOYEE_TBL
GROUP BY CITY, LAST_NAME;

CITY LAST_NAME
------------ ---------
GREENWOOD STEPHENS
INDIANAPOLIS GLASS
INDIANAPOLIS PLEW
INDIANAPOLIS SPURGEON
INDIANAPOLIS WALLACE
WHITELAND GLASS

6 rows selected.
All employee records in the EMPLOYEE_TBL table
are now counted, and the results are grouped by
CITY, but ordered by the count on each city first:
SELECT CITY, COUNT(*)
FROM EMPLOYEE_TBL
GROUP BY CITY
ORDER BY 2,1;

CITY COUNT(*)
----------- -----------
GREENWOOD 1
WHITELAND 1
INDIANAPOLIS 4
Although GROUP BY and ORDER BY perform a
similar function, there is one major difference.
The GROUP BY clause is designed to group
identical data, whereas the ORDER BY clause is
designed merely to put data into a specific
order. GROUP BY and ORDER BY can be used
in the same SELECT statement, but must
follow a specific order. The GROUP BY clause is
always placed before the ORDER BY clause in
the SELECT statement.
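
The division of labor described above (GROUP BY forms the groups, ORDER BY decides their order) can be sketched as follows. This assumes SQLite through Python's sqlite3 module, which the slides do not use, and a one-column version of EMPLOYEE_TBL.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
cities = ["GREENWOOD", "INDIANAPOLIS", "WHITELAND",
          "INDIANAPOLIS", "INDIANAPOLIS", "INDIANAPOLIS"]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?)", [(c,) for c in cities])

# Same groups both times; only the ORDER BY clause changes the row order.
by_count_asc = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY CITY ORDER BY 2, 1"
).fetchall()
by_count_desc = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY CITY ORDER BY 2 DESC, 1"
).fetchall()
print(by_count_asc)
print(by_count_desc)
```

The grouping is identical in both queries, so the counts never change; only the position of each group in the output does.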
The HAVING clause, when used in conjunction with the
GROUP BY clause in a SELECT statement, tells GROUP BY
which groups to include in the output. HAVING is
to GROUP BY as WHERE is to SELECT. In other
words, the WHERE clause places conditions on
the rows that are selected before grouping, and the
HAVING clause places conditions on the groups
created by the GROUP BY clause.
The following is the position of the HAVING
clause in a query:

SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
The following is the syntax of the SELECT
statement, including the HAVING clause:
SELECT COLUMN1, COLUMN2
FROM TABLE1, TABLE2
WHERE CONDITIONS
GROUP BY COLUMN1, COLUMN2
HAVING CONDITIONS
ORDER BY COLUMN1, COLUMN2
In the following example, you select the average pay rate and
salary for all cities except GREENWOOD. You group the output by
CITY, but only want to display those groups (cities) that have an
average salary greater than $20,000. You sort the results by
average salary for each city:
SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
WHERE CITY <> 'GREENWOOD'
GROUP BY CITY
HAVING AVG(SALARY) > 20000
ORDER BY 3;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
WHITELAND                        40000

1 row selected.
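
To see what HAVING contributes in the example above, the query can be run with and without it. This is a sketch that assumes SQLite through Python's sqlite3 module (not one of the databases named in the slides) and a three-column copy of EMP_PAY_TMP.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)",
    [
        ("GREENWOOD", None, 30000),
        ("INDIANAPOLIS", 14.75, None),
        ("WHITELAND", None, 40000),
        ("INDIANAPOLIS", None, 20000),
        ("INDIANAPOLIS", 11, None),
        ("INDIANAPOLIS", 15, None),
    ],
)

base = ("SELECT CITY, AVG(PAY_RATE), AVG(SALARY) FROM EMP_PAY_TMP "
        "WHERE CITY <> 'GREENWOOD' GROUP BY CITY ")

# WHERE removes GREENWOOD rows before grouping; two groups remain.
without_having = conn.execute(base + "ORDER BY 3").fetchall()
# HAVING then removes whole groups whose average salary is not above 20000.
with_having = conn.execute(base + "HAVING AVG(SALARY) > 20000 ORDER BY 3").fetchall()
print(without_having)
print(with_having)
```

INDIANAPOLIS survives the WHERE clause but is dropped by HAVING, because its average salary is exactly 20000 and the condition asks for more than 20000.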


Concatenation is the process of combining two separate
strings into one string. For example, you might
want to concatenate an individual's first and last
names into a single string for the complete
name. JOHN concatenated with a space and SMITH
produces JOHN SMITH.
Example 1, using the table Geography.

MySQL/Oracle:
SELECT CONCAT(region_name, store_name)
FROM Geography
WHERE store_name = 'Boston';

Result:
'EastBoston'

SQL Server:
SELECT region_name + ' ' + store_name
FROM Geography
WHERE store_name = 'Boston';

Result:
'East Boston'
This statement uses the || concatenation operator found in Oracle
(SQL Server traditionally uses + instead). It concatenates the last
name with the first name and inserts a comma and a space
between the two original values.
SELECT LAST_NAME || ', ' || FIRST_NAME NAME
FROM EMPLOYEE_TBL;

NAME
-----------------
STEPHENS, TINA
PLEW, LINDA
GLASS, BRANDON
GLASS, JACOB
WALLACE, MARIAH
SPURGEON, TIFFANY

6 rows selected.
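
The || form of concatenation shown above also works in SQLite, which makes it easy to try from Python. The sketch below assumes SQLite and the sqlite3 module, and loads only two of the employees as sample rows.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (LAST_NAME TEXT, FIRST_NAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("STEPHENS", "TINA"), ("PLEW", "LINDA")],
)

# Three operands are joined: last name, the literal ', ', and first name.
full_names = [
    row[0]
    for row in conn.execute(
        "SELECT LAST_NAME || ', ' || FIRST_NAME AS NAME "
        "FROM EMPLOYEE_TBL ORDER BY rowid"
    )
]
print(full_names)
```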
The TRANSLATE function searches a string of characters and checks
for a specific character, makes note of the position
found, searches the replacement string at the
same position, and then replaces that character
with the new value. The syntax is

TRANSLATE(CHARACTER SET, VALUE1, VALUE2)

This SQL statement substitutes every occurrence
of I in the string with A, every occurrence of N
with B, and every occurrence of D with C:

SELECT TRANSLATE(CITY,'IND','ABC') CITY_TRANSLATION
FROM EMPLOYEE_TBL;

The next statement shows each city next to its translation:

SELECT CITY, TRANSLATE(CITY,'IND','ABC') CITY_TRANSLATION
FROM EMPLOYEE_TBL;

CITY CITY_TRANSLATION
------------ ------------
GREENWOOD GREEBWOOC
INDIANAPOLIS ABCAABAPOLAS
WHITELAND WHATELABC
INDIANAPOLIS ABCAABAPOLAS
INDIANAPOLIS ABCAABAPOLAS
INDIANAPOLIS ABCAABAPOLAS

6 rows selected.
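
The position-by-position mapping that TRANSLATE performs can be illustrated outside the database. The sketch below uses Python's built-in str.maketrans and str.translate as a stand-in; it is not the database function itself, only the same character mapping applied to the three distinct cities.

```python
# str.maketrans pairs characters by position: I -> A, N -> B, D -> C.
mapping = str.maketrans("IND", "ABC")

cities = ["GREENWOOD", "INDIANAPOLIS", "WHITELAND"]
translated = [city.translate(mapping) for city in cities]

for city, result in zip(cities, translated):
    print(city, result)
```

Each character is looked up on its own, so the letters I, N, and D are replaced wherever they occur, not only where they appear together as "IND".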
The REPLACE function is used to replace every occurrence
of a character or string with another specified character
or string. Its use is similar to the TRANSLATE function,
except that one whole search string is replaced as a unit
rather than character by character.
This statement returns all of the cities in the employee
table and the same cities
with each I replaced with a Z:
SELECT CITY, REPLACE(CITY,'I','Z')
FROM EMPLOYEE_TBL;

CITY REPLACE(CITY)
------------ -------------
GREENWOOD GREENWOOD
INDIANAPOLIS ZNDZANAPOLZS
WHITELAND WHZTELAND
INDIANAPOLIS ZNDZANAPOLZS
INDIANAPOLIS ZNDZANAPOLZS
INDIANAPOLIS ZNDZANAPOLZS

6 rows selected.
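
The difference between REPLACE and TRANSLATE shows up when the search string has more than one character. A small sketch, assuming SQLite's REPLACE function through Python's sqlite3 module (the slides do not use SQLite):

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

# Single-character search string: every I becomes Z.
single_char = conn.execute(
    "SELECT REPLACE('INDIANAPOLIS', 'I', 'Z')"
).fetchone()[0]

# Multi-character search string: only the whole sequence IND is replaced.
whole_string = conn.execute(
    "SELECT REPLACE('INDIANAPOLIS', 'IND', 'ABC')"
).fetchone()[0]

print(single_char)
print(whole_string)
```

With 'IND' as the search string, REPLACE changes only the leading IND, whereas TRANSLATE with the same arguments would change every I, N, and D in the city name.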
The UPPER function is used to convert lowercase
letters to uppercase letters for a specific string.

The syntax is as follows:

UPPER(character string)
This SQL statement converts all characters in the
column to uppercase:
SELECT UPPER(CITY)
FROM EMPLOYEE_TBL;

UPPER(CITY)
-------------
GREENWOOD
INDIANAPOLIS
WHITELAND
INDIANAPOLIS
INDIANAPOLIS
INDIANAPOLIS

6 rows selected.
The LOWER function is used to convert uppercase letters to
lowercase letters for a specific string.

The syntax is as follows:

LOWER(character string)
This SQL statement converts all characters in the
column to lowercase:
SELECT LOWER(CITY)
FROM EMPLOYEE_TBL;
 
LOWER(CITY)
 ---------------
greenwood
indianapolis
whiteland
indianapolis
indianapolis
indianapolis
 
6 rows selected.
The concept of substring is the capability to extract
part of a string, or a "sub" of the
string. For example, the following values are
substrings of JOHNSON:
- J
- JOHN
- JO
- ON
- SON

The syntax for SQL Server is

SUBSTRING(COLUMN NAME, STARTING POSITION, LENGTH)
The following is an example that is compatible
with Microsoft SQL Server and MySQL:
SELECT EMP_ID, SUBSTRING(EMP_ID,1,3)
FROM EMPLOYEE_TBL;
 
EMP_ID SUB
 ---------------- ------------------
311549902 311
442346889 442
213764555 213
313782439 313
220984332 220
443679012 443
 
6 rows affected.
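
The start position and length arguments can be tried on the JOHNSON example as well. This sketch assumes SQLite, where the function is spelled SUBSTR, called through Python's sqlite3 module; positions are counted from 1, as in the example above.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

# SUBSTR(string, start, length): start is 1-based.
john, son, emp_prefix = conn.execute(
    "SELECT SUBSTR('JOHNSON', 1, 4), "
    "       SUBSTR('JOHNSON', 5, 3), "
    "       SUBSTR('311549902', 1, 3)"
).fetchone()
print(john, son, emp_prefix)
```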
The INSTR function is used to search a string of characters for a
specific set of characters and report the position of
those characters. The syntax is as follows:

INSTR(COLUMN NAME, 'SET', [ START POSITION [ , OCCURRENCE ] ]);

This SQL statement looks for the first
occurrence of the letter A in the PROD_DESC
column; it returns 0 when the letter is not found:

SELECT PROD_DESC,
INSTR(PROD_DESC,'A',1,1)
FROM PRODUCTS_TBL;
 
PROD_DESC INSTR(PROD_DESC,’A’,1,1)
 ----------------------------------- ------------------------------------
WITCHES COSTUME 0
PLASTIC PUMPKIN 18 INCH 3
FALSE PARAFFIN TEETH 2
LIGHTED LANTERNS 10
ASSORTED COSTUMES 1
CANDY CORN 2
PUMPKIN CANDY 10
PLASTIC SPIDERS 3
ASSORTED MASKS 1
KEY CHAIN 7
OAK BOOKSHELF 2
 
11 rows selected.
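
A few of the positions above can be checked with SQLite's INSTR, called here through Python's sqlite3 module. Both are assumptions of this sketch; SQLite's INSTR takes only the string and the search value, without the optional start position and occurrence arguments.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

descriptions = ["WITCHES COSTUME", "PLASTIC PUMPKIN 18 INCH", "KEY CHAIN"]

# INSTR returns the 1-based position of the first match, or 0 if none.
positions = [
    conn.execute("SELECT INSTR(?, 'A')", (desc,)).fetchone()[0]
    for desc in descriptions
]
print(positions)
```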
The LTRIM function is used to trim characters from the left of a
string. The syntax is

LTRIM(CHARACTER STRING [ ,'set' ])

This SQL statement returns each position and also the position
with the characters in 'SALES' trimmed from the left side of the
character string. The second argument is a set of characters, not a
word: every leading character that appears in the set is removed,
which is why SHIPPER becomes HIPPER.
SELECT POSITION, LTRIM(POSITION,'SALES')
FROM EMPLOYEE_PAY_TBL;

POSITION      LTRIM(POSITION,
------------- ---------------
MARKETING     MARKETING
TEAM LEADER   TEAM LEADER
SALES MANAGER MANAGER
SALESMAN      MAN
SHIPPER       HIPPER
SHIPPER       HIPPER
 
6 rows selected.
The RTRIM function is also used to trim characters, but this time
from the right of a string. The syntax is

RTRIM(CHARACTER STRING [ ,'set' ])

This SQL statement returns a list of the positions in EMPLOYEE_PAY_TBL as
well as the positions with the letters E and R trimmed from the right of
the character string:
SELECT POSITION, RTRIM(POSITION,'ER')
FROM EMPLOYEE_PAY_TBL;

POSITION      RTRIM(POSITION,
------------- ---------------
MARKETING     MARKETING
TEAM LEADER   TEAM LEAD
SALES MANAGER SALES MANAG
SALESMAN      SALESMAN
SHIPPER       SHIPP
SHIPPER       SHIPP
 
6 rows selected.
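
Both trim functions treat their second argument as a set of characters rather than a word. The sketch below assumes SQLite's two-argument LTRIM and RTRIM, called through Python's sqlite3 module, and a handful of the position values.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

positions = ["MARKETING", "SALES MANAGER", "SALESMAN", "SHIPPER"]

# Characters are stripped from one end for as long as they belong to the set.
trimmed = [
    conn.execute("SELECT LTRIM(?, 'SALES'), RTRIM(?, 'ER')", (p, p)).fetchone()
    for p in positions
]
for position, (left, right) in zip(positions, trimmed):
    print(repr(position), repr(left), repr(right))
```

Trimming stops at the first character that is not in the set. In this sketch that leaves the space in front of MANAGER in place, and it removes the S from SHIPPER even though the word SALES never appears in it.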
The DECODE function is used to search a string for a value or string,
and if the string is found, an alternative string is
displayed as part of the query results.

The syntax is

DECODE(COLUMN NAME, 'SEARCH1', 'RETURN1', [ 'SEARCH2',
'RETURN2', 'DEFAULT VALUE' ])

In the following example, DECODE is used on the values for CITY in EMPLOYEE_TBL:

SELECT CITY,
DECODE(CITY,'INDIANAPOLIS','INDY',
'GREENWOOD','GREEN','OTHER')
FROM EMPLOYEE_TBL;
 
CITY DECOD
 ------------------ ---------------------
GREENWOOD GREEN
INDIANAPOLIS INDY
WHITELAND OTHER
INDIANAPOLIS INDY
INDIANAPOLIS INDY
INDIANAPOLIS INDY

6 rows selected.
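
DECODE is specific to some implementations; the same search-and-return logic can be written as a CASE expression. The sketch below is an equivalent under that assumption, run in SQLite through Python's sqlite3 module, and is not the DECODE function itself.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

# CASE plays the role of DECODE: search values, return values, then a default.
case_sql = (
    "SELECT CASE ? "
    "WHEN 'INDIANAPOLIS' THEN 'INDY' "
    "WHEN 'GREENWOOD' THEN 'GREEN' "
    "ELSE 'OTHER' END"
)

cities = ["GREENWOOD", "INDIANAPOLIS", "WHITELAND"]
labels = [conn.execute(case_sql, (city,)).fetchone()[0] for city in cities]
print(labels)
```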
The LENGTH function is a common function used to find the length of
a string, number, date, or expression. Depending on the
implementation, the result is counted in bytes or in characters.
The syntax is

LENGTH(CHARACTER STRING)

This SQL statement returns the product
description and also its corresponding length:

SELECT PROD_DESC, LENGTH(PROD_DESC)
FROM PRODUCTS_TBL;
 
PROD_DESC LENGTH(PROD_DESC)
 --------------------------- --------------------------------
WITCHES COSTUME 15
PLASTIC PUMPKIN 18 INCH 23
FALSE PARAFFIN TEETH 20
LIGHTED LANTERNS 16
ASSORTED COSTUMES 17
CANDY CORN 10
PUMPKIN CANDY 13
PLASTIC SPIDERS 15
ASSORTED MASKS 14
KEY CHAIN 9
OAK BOOKSHELF 13
 
11 rows selected.
The IFNULL function is used to return data from one expression if
another expression is NULL. IFNULL can be used
with most data types; however, the value and the
substitute must be the same data type.

The syntax is

IFNULL('VALUE', 'SUBSTITUTION')
This SQL statement finds NULL values and substitutes
9999999999 for any NULL values:
 
SELECT PAGER, IFNULL(PAGER,9999999999)
FROM EMPLOYEE_TBL;
 
PAGER      IFNULL(PAGER,
---------- -------------
           9999999999
           9999999999
3175709980 3175709980
8887345678 8887345678
           9999999999
           9999999999
 
 
6 rows selected.
The COALESCE function is similar to the IFNULL function in that it
is used to specifically replace NULL values within the
result set. The COALESCE function, however, can
accept a whole set of values and checks each one
in order until it finds a non-NULL result. If a
non-NULL result is not present, COALESCE returns a
NULL value.
The following example demonstrates the COALESCE function by
giving us the first non-NULL value of BONUS, SALARY, and
PAY_RATE:
SELECT EMP_ID, COALESCE(BONUS,SALARY,PAY_RATE)
FROM EMPLOYEE_PAY_TBL;
 
EMP_ID COALESCE(BONUS,SALARY,PAY_RATE)
--------------------------  ---------------------------------------------------
213764555 2000.00
220984332 11.00
311549902 40000.00
313782439 1000.00
442346889 14.75
443679012 15.00
 
6 rows selected.
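
How IFNULL and COALESCE pick a value can be sketched with a few made-up rows. The BONUS, SALARY, and PAY_RATE values below are hypothetical and are not the contents of EMPLOYEE_PAY_TBL, and SQLite through Python's sqlite3 module is assumed.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

# Hypothetical (BONUS, SALARY, PAY_RATE) rows; None is sent as NULL.
pay_rows = [
    (None, 40000.0, None),
    (1000.0, 20000.0, None),
    (None, None, 14.75),
    (None, None, None),
]

# COALESCE returns the first non-NULL argument, or NULL if there is none.
first_non_null = [
    conn.execute("SELECT COALESCE(?, ?, ?)", row).fetchone()[0]
    for row in pay_rows
]

# IFNULL substitutes a value only when its first argument is NULL.
pagers = [
    conn.execute("SELECT IFNULL(?, 9999999999)", (pager,)).fetchone()[0]
    for pager in (None, 3175709980)
]
print(first_non_null)
print(pagers)
```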
LPAD (left pad) is used to add characters or spaces
to the left of a string. The syntax is

LPAD(CHARACTER STRING, TOTAL LENGTH [ ,'PAD CHARACTERS' ])

The following example pads periods to the left of
each product description, so that the actual value
and the padded periods total 30 characters:
SELECT LPAD(PROD_DESC,30,'.') PRODUCT
FROM PRODUCTS_TBL;
 
PRODUCT
 -----------------------------------------
...............WITCHES COSTUME
.......PLASTIC PUMPKIN 18 INCH
..........FALSE PARAFFIN TEETH
..............LIGHTED LANTERNS
.............ASSORTED COSTUMES
....................CANDY CORN
.................PUMPKIN CANDY
...............PLASTIC SPIDERS
................ASSORTED MASKS
.....................KEY CHAIN
.................OAK BOOKSHELF
 
11 rows selected.
RPAD (right pad) is used to add characters or
spaces to the right of a string. The syntax is

RPAD(CHARACTER STRING, TOTAL LENGTH [ ,'PAD CHARACTERS' ])

The following example pads periods to the right of
each product description, so that the actual value
and the padded periods total 30 characters:
SELECT RPAD(PROD_DESC,30,'.') PRODUCT
FROM PRODUCTS_TBL;
 
PRODUCT
 -------------------------------------------
WITCHES COSTUME...............
PLASTIC PUMPKIN 18 INCH.......
FALSE PARAFFIN TEETH..........
LIGHTED LANTERNS..............
ASSORTED COSTUMES.............
CANDY CORN....................
PUMPKIN CANDY.................
PLASTIC SPIDERS...............
ASSORTED MASKS................
KEY CHAIN.....................
OAK BOOKSHELF.................
 
11 rows selected.
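
The padding arithmetic behind LPAD and RPAD can be illustrated with Python's str.rjust and str.ljust. These are stand-ins chosen for this sketch, not the database functions: they never shorten a value that is already longer than the requested width.

```python
desc = "WITCHES COSTUME"  # 15 characters

# Pad with periods up to a total width of 30 characters.
left_padded = desc.rjust(30, ".")   # periods added on the left, like LPAD
right_padded = desc.ljust(30, ".")  # periods added on the right, like RPAD

print(left_padded)
print(right_padded)
print(len(left_padded), len(right_padded))
```

The number of periods is the total width minus the length of the value, here 30 - 15 = 15.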
You should notice two things regarding the
differences between numeric data types and
character string data types:

1. Arithmetic expressions and functions can be
   used on numeric values.
2. Numeric values are right-justified, whereas
   character string data types are left-justified in the
   output results.
The following is an example of a numeric conversion using an
Oracle conversion function:
 
SELECT EMP_ID, TO_NUMBER(EMP_ID)
FROM EMPLOYEE_TBL;
 
EMP_ID TO_NUMBER(EMP_ID)
----------------- ---------------------------
311549902 311549902
442346889 442346889
213764555 213764555
313782439 313782439
220984332 220984332
443679012 443679012
 
6 rows selected.
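The following is an example of a numeric-to-character conversion
written in SQL Server syntax with the STR function:

SELECT PAY = PAY_RATE, NEW_PAY = STR(PAY_RATE)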
FROM EMPLOYEE_PAY_TBL
WHERE PAY_RATE IS NOT NULL;
 
PAY NEW_PAY
-------- -----------------
17.5 17.5
14.75 14.75
18.25 18.25
12.8 12.8
11 11
15 15
 
6 rows affected.
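
TO_NUMBER and STR are implementation-specific; the ANSI CAST expression performs comparable conversions. The sketch below uses CAST in SQLite through Python's sqlite3 module, which is an assumption of this example rather than the functions shown above.

```python
import sqlite3

# Assumption: SQLite stands in for the database used in the slides.
conn = sqlite3.connect(":memory:")

# Character string to number; typeof reports the resulting storage class.
as_number, number_type = conn.execute(
    "SELECT CAST('311549902' AS INTEGER), "
    "       typeof(CAST('311549902' AS INTEGER))"
).fetchone()

# Number to character string.
as_text, text_type = conn.execute(
    "SELECT CAST(14.75 AS TEXT), typeof(CAST(14.75 AS TEXT))"
).fetchone()

print(as_number, number_type)
print(repr(as_text), text_type)
```

Once the value is numeric, arithmetic such as as_number + 1 works on it, which is the first of the two differences between numeric and character data noted above.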
Thank you and
God Bless us All!...
