Lab: INNER JOIN, GROUP BY, and HAVING Clauses

This document describes exercises involving inner joins, group by, and having clauses in SQL. It introduces the concepts and provides sample queries to get: 1) cities with populations over 1,000,000 or 500,000, ordered by population; 2) countries ordered by unaccounted population, with name, population, total city population, and unaccounted population; 3) the same as 2 but only for countries with total city population over 7,000,000. It breaks problems into subqueries to improve performance and readability.

Uploaded by

Ionel Otmeteanu

Chapter 3.

3rd Lab: INNER JOIN, GROUP BY, and HAVING clauses

Contents
Chapter 3. 3rd Lab: INNER JOIN, GROUP BY, and HAVING clauses
3.1 Introduction
3.2 Exercises in Access
3.3 Exercises in Oracle
3.4 Best practice rules
3.5 Homework

3.1 Introduction
Please recall that:

• Whenever ambiguities are possible in SQL statements, you should prefix column
names with the corresponding table instance names.
• Even when no ambiguities are possible, prefixing all column names in ON sub-clauses
with their corresponding table instance names is compulsory.
• In ORDER BY, ASC is implicit.
• Constants should be hard-coded in SQL only exceptionally; generally, use parameters
instead (e.g. P3.1b below generalizes billions of queries of the type P3.1a below).
• The SQL GROUP BY c1, …, cn clause partitions the set of records computed by the
corresponding FROM clause and then filtered by the corresponding WHERE clause (if any),
according to the equivalence relation $\ker(c_1 \bullet \dots \bullet c_n)$, where
c1, …, cn are columns of the table instances of that FROM clause.
• Recall that, for any function product $f \bullet g$, $\ker(f \bullet g) = \ker(f) \cap \ker(g)$,
and that, for any function $f : A \to B$,
$\ker(f) = \{(x, y) \in A^2 \mid f(x) = f(y)\} \subseteq A^2$ (called the kernel or nucleus of f).
• If you do not name your SQL SELECT clause expressions, RDBMSs assign
automatically generated names to them, generally of the type Expr1, Expr2, …
• Although the set of aggregate functions provided varies between RDBMSs, most of them
offer at least the following most frequently used ones: COUNT (for computing set
cardinals), SUM, AVG (for computing arithmetic means), MIN(imum), and MAX(imum).
• You cannot compose two SQL aggregate functions (although you can compose an
aggregate function with other library functions).
• The so-called GROUP BY golden rule states that, in the presence of the GROUP BY clause,
the corresponding SELECT clause can only contain columns/expressions listed in the GROUP
BY clause and/or columns/expressions based on columns of the corresponding FROM
clause, provided that the latter are arguments of aggregate functions.
• The order of the clauses of SQL SELECT statements is fixed not only for syntactical reasons,
but for conceptual ones too: FROM, WHERE, GROUP BY, HAVING, and ORDER BY are written in
exactly the order in which RDBMSs logically evaluate them (the SELECT list itself being
logically computed after HAVING and before ORDER BY).
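
To make the kernel-based reading of GROUP BY concrete, here is a minimal Python sketch (it is not part of the lab's db). The sample rows, their values, and the helper names kernel and group_by are all hypothetical; the sketch only checks, on this toy data, that ker(f • g) = ker(f) ∩ ker(g), and that grouping partitions the rows into the classes on which an aggregate such as SUM is then applied.

```python
from collections import defaultdict

# Hypothetical toy rows (NOT the lab's db instance).
rows = [
    {"city": "A", "state": 1, "country": 10, "population": 500},
    {"city": "B", "state": 1, "country": 10, "population": 300},
    {"city": "C", "state": 2, "country": 10, "population": 200},
    {"city": "D", "state": 3, "country": 20, "population": 700},
]


def kernel(rows, f):
    """Return ker(f) as the set of index pairs (i, j) with f(rows[i]) == f(rows[j])."""
    indexes = range(len(rows))
    return {(i, j) for i in indexes for j in indexes if f(rows[i]) == f(rows[j])}


def group_by(rows, *columns):
    """Partition rows into the equivalence classes of ker(c1 • ... • cn)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in columns)].append(row)
    return dict(groups)


def state(row):
    return row["state"]


def country(row):
    return row["country"]


def state_country(row):
    """The function product state • country."""
    return (row["state"], row["country"])


# ker(f • g) = ker(f) ∩ ker(g), checked on the toy rows.
assert kernel(rows, state_country) == kernel(rows, state) & kernel(rows, country)

# Analogue of: SELECT country, SUM(population) ... GROUP BY country
sums = {key: sum(r["population"] for r in group)
        for key, group in group_by(rows, "country").items()}
print(sums)  # {(10,): 1000, (20,): 700}
```

Each key of sums identifies one equivalence class, which is why the SELECT list of a grouped query may only mention the grouping columns or aggregates computed over each class.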

3.2 Exercises in Access

P3.1 a. Compute the set of cities (name, corresponding state and country names, and population)
that have at least 1,000,000 inhabitants, in the descending order of their population and then
ascending on country, state, and city names.

b. Parameterize a. above and compute result for both 1,000,000 and 500,000.

Solution:

What the result should look like:

Inspecting the corresponding data instances, only three cities qualify for the result (in this
order): New York, London, and Bucharest.

The result should then be:

City       STATES.State    COUNTRIES.Country  CITIES.Population
New York   New York        U.S.A.             8,336,697
London     Greater London  U.K.               8,308,369
Bucharest  Bucharest       Romania            1,883,425

Data needed for final result: City, CITIES.Population, STATES.State, and
COUNTRIES.Country;

Data needed to link these three tables’ instances: CITIES.State = STATES.x and
STATES.Country = COUNTRIES.x

Data needed for filtering: CITIES.Population

SQL solution:
SELECT City, STATES.State, COUNTRIES.Country,
CITIES.Population
FROM (CITIES INNER JOIN STATES ON CITIES.State = STATES.x)
INNER JOIN COUNTRIES ON STATES.Country = COUNTRIES.x
WHERE CITIES.Population >= 1000000
ORDER BY CITIES.Population DESC, COUNTRIES.Country,
STATES.State, City;

The result of running it against the lab’s db instance is the following:

Figure 3.1 Result of P3.1a

b.
The only difference with respect to the above query is replacing the hard-coded constant
1000000 with a parameter:
SELECT City, STATES.State, COUNTRIES.Country,
CITIES.Population
FROM (CITIES INNER JOIN STATES ON CITIES.State = STATES.x)
INNER JOIN COUNTRIES ON STATES.Country = COUNTRIES.x
WHERE CITIES.Population >=
[Please enter desired minimum city population:]
ORDER BY CITIES.Population DESC, COUNTRIES.Country,
STATES.State, City;
Obviously, the result of running it against the lab’s db instance with the actual parameter value
1000000 is the same as the one in figure 3.1 above.
The result of running it against the lab’s db instance with the actual parameter value 500000 (figure
3.2) also selects Chișinău, Memphis, and Washington (figure 3.3).

Figure 3.2 Entering actual parameter value for P3.1b

Figure 3.3 Result of P3.1b for 500,000
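
The parameterization idea of P3.1b can also be tried outside Access. The following sketch uses Python's standard sqlite3 module with an in-memory database. The column types, the state name used for Chișinău, and the parameter name min_city_population are assumptions; only four of the lab's cities are loaded, with the populations quoted above. SQLite does not need the parentheses that Access requires around nested joins, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and a four-city subset of the lab data (column types are guesses).
conn.executescript("""
CREATE TABLE COUNTRIES (x INTEGER PRIMARY KEY, Country TEXT, Population INTEGER);
CREATE TABLE STATES (x INTEGER PRIMARY KEY, State TEXT, Country INTEGER);
CREATE TABLE CITIES (x INTEGER PRIMARY KEY, City TEXT, State INTEGER, Population INTEGER);
INSERT INTO COUNTRIES VALUES
  (1, 'U.S.A.', 316836000), (2, 'U.K.', 63181775),
  (3, 'Romania', 20121641), (4, 'Moldavia', 3559500);
INSERT INTO STATES VALUES
  (1, 'New York', 1), (2, 'Greater London', 2),
  (3, 'Bucharest', 3), (4, 'Chișinău', 4);
INSERT INTO CITIES VALUES
  (1, 'New York', 1, 8336697), (2, 'London', 2, 8308369),
  (3, 'Bucharest', 3, 1883425), (4, 'Chișinău', 4, 671800);
""")

# Same query as P3.1b, with a named parameter instead of a hard-coded constant.
QUERY = """
SELECT City, STATES.State, COUNTRIES.Country, CITIES.Population
FROM CITIES
     INNER JOIN STATES ON CITIES.State = STATES.x
     INNER JOIN COUNTRIES ON STATES.Country = COUNTRIES.x
WHERE CITIES.Population >= :min_city_population
ORDER BY CITIES.Population DESC, COUNTRIES.Country, STATES.State, City;
"""


def cities_with_at_least(min_city_population):
    """Run the parameterized query for one actual parameter value."""
    return conn.execute(QUERY, {"min_city_population": min_city_population}).fetchall()


for threshold in (1_000_000, 500_000):
    print(threshold, [row[0] for row in cities_with_at_least(threshold)])
```

One query text serves every threshold; only the actual parameter value changes between runs.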

P3.2 a. Compute the set of countries (name, population, sum of corresponding cities population,
unaccounted cities population), in the descending order of unaccounted cities population, sum of
corresponding cities population, stored countries population, and then ascending on country
names.

b. Same as a. above, but only for countries for which the sum of cities population is at least
equal to a parameter value; run it for 7,000,000 people.

c. Same as b. above, but only for countries whose names start with ‘R’; run it for 2,500,000
people.

Solution:

What the result should look like:

Inspecting the corresponding data instances, all four countries qualify for the result
(in this order): U.S.A., U.K., Romania, and Moldavia.

The result should then be (where UnaccCityPop = Population - SumCityPop):

Country   Population   SumCityPop  UnaccCityPop
U.S.A.    316,836,000  9,624,175   307,211,825
U.K.      63,181,775   8,308,369   54,873,406
Romania   20,121,641   2,604,964   17,516,677
Moldavia  3,559,500    671,800     2,887,700

Data needed for final result: COUNTRIES.Country, COUNTRIES.Population, and
CITIES.Population;

Data needed to link these three tables’ instances: CITIES.State = STATES.x and
STATES.Country = COUNTRIES.x

SQL solution:

Both conceptually and from the RDBMSs performance point of view, it is preferable to split
complex problems into smaller and simpler sub-problems and to interconnect in the end their
solutions.

Consequently, let us first solve the sub-problem of computing the sum of cities populations per
countries.

The following query uses the SQL aggregate function SUM to compute the sum
of all city populations stored in the db (see figure 3.4 for its result):

SELECT SUM(Population) AS TotCitiesPop FROM CITIES;

Figure 3.4 The sum of all cities’ populations

For computing total city populations per country, we need to partition cities into groups,
one per country, so that SUM computes totals per country instead of the overall one:

SELECT Sum(CITIES.Population) AS CityPopSum, Country
FROM STATES INNER JOIN CITIES
ON STATES.x = CITIES.State
GROUP BY Country;

Running this query, saved as P3-2-0, against the current lab’s db instance computes the
following result:

Figure 3.5 The sum of all cities’ populations per country

The second sub-problem is to use the results of the previous one for computing final results;
obviously, a join of query P3-2-0 with the COUNTRIES table is needed in order to get both country
names and populations:
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
[Population]-[CityPopSum] AS UnaccPop
FROM [P3-2-0] INNER JOIN COUNTRIES
ON [P3-2-0].Country = COUNTRIES.x
ORDER BY [Population]-[CityPopSum] DESC, CityPopSum DESC,
Population DESC, COUNTRIES.Country;

The result of running it against the lab’s db instance is the following:

Figure 3.6 Result of P3.2a
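
The two-step approach above (a grouped query keyed by the surrogate key, then a join with COUNTRIES) can be sketched with Python's standard sqlite3 module, using a view in place of the saved Access query. The column types, the state names Tennessee, D.C., and Chișinău, and the Memphis and Washington populations are assumptions (the latter two were chosen so that the U.S.A. total matches the expected table above). Only Bucharest is loaded for Romania, so its total differs from the lab's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema and a partial, partly illustrative data set.
CREATE TABLE COUNTRIES (x INTEGER PRIMARY KEY, Country TEXT, Population INTEGER);
CREATE TABLE STATES (x INTEGER PRIMARY KEY, State TEXT, Country INTEGER);
CREATE TABLE CITIES (x INTEGER PRIMARY KEY, City TEXT, State INTEGER, Population INTEGER);
INSERT INTO COUNTRIES VALUES
  (1, 'U.S.A.', 316836000), (2, 'U.K.', 63181775),
  (3, 'Romania', 20121641), (4, 'Moldavia', 3559500);
INSERT INTO STATES VALUES
  (1, 'New York', 1), (2, 'Tennessee', 1), (3, 'D.C.', 1),
  (4, 'Greater London', 2), (5, 'Bucharest', 3), (6, 'Chișinău', 4);
INSERT INTO CITIES VALUES
  (1, 'New York', 1, 8336697), (2, 'Memphis', 2, 655155),
  (3, 'Washington', 3, 632323), (4, 'London', 4, 8308369),
  (5, 'Bucharest', 5, 1883425), (6, 'Chișinău', 6, 671800);

-- Step 1: total city population per country, keyed by the surrogate key.
CREATE VIEW P3_2_0 AS
SELECT SUM(CITIES.Population) AS CityPopSum, STATES.Country AS Country
FROM STATES INNER JOIN CITIES ON STATES.x = CITIES.State
GROUP BY STATES.Country;
""")

# Step 2: join the view with COUNTRIES to get names and stored populations.
rows = conn.execute("""
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
       COUNTRIES.Population - CityPopSum AS UnaccPop
FROM P3_2_0 INNER JOIN COUNTRIES ON P3_2_0.Country = COUNTRIES.x
ORDER BY UnaccPop DESC, CityPopSum DESC,
         COUNTRIES.Population DESC, COUNTRIES.Country;
""").fetchall()

for row in rows:
    print(row)
```

The view never touches COUNTRIES: it groups on the integer key stored in STATES, and country names are fetched only once, in the final join.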

Note that, unfortunately, many programmers would actually come up with the following
equivalent, but not optimal solution:

P3-2-0Bis:

SELECT Sum(CITIES.Population) AS CityPopSum,
COUNTRIES.Country
FROM (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State) INNER JOIN COUNTRIES
ON STATES.Country = COUNTRIES.x
GROUP BY COUNTRIES.Country;

P-3-2aBis:
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
[Population]-[CityPopSum] AS UnaccPop
FROM [P3-2-0Bis] INNER JOIN COUNTRIES
ON [P3-2-0Bis].Country = COUNTRIES.Country
ORDER BY [Population]-[CityPopSum] DESC, CityPopSum DESC,
Population DESC, COUNTRIES.Country;

Note that P3-2-0Bis already takes more time and more memory and disk space, as it performs an
additional join and computes country names (which, on average, have some 32 ASCII chars) instead
of surrogate key values (which need 4 binary bytes).

Much worse is P-3-2aBis, which joins not on surrogate key values (fixed-size integers, each
typically compared in a single machine operation), like P-3-2a, but on ASCII strings (compared
character by character, which is generally slower and involves some 32 chars per value on
average).

b.
The only thing that has to be done is to add a HAVING clause to P3-2-0:

P3-2-0b:
SELECT Sum(CITIES.Population) AS CityPopSum, STATES.Country
FROM STATES INNER JOIN CITIES ON STATES.x = CITIES.State
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
[Please enter desired minimum cities total population per country:];

Figure 3.7 shows the Access parameter input window, figure 3.8 the corresponding result
of P3-2-0b for 7,000,000, and figure 3.9 the result of the corresponding P-3-2b:

SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
[Population]-[CityPopSum] AS UnaccPop
FROM [P3-2-0b] INNER JOIN COUNTRIES
ON [P3-2-0b].Country = COUNTRIES.x
ORDER BY [Population]-[CityPopSum] DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;

Figure 3.7 Entering actual parameter value for P3.2-0b

Figure 3.8 Result of P3.2-0b for 7,000,000 people

Figure 3.9 Result of P3.2b for 7,000,000 people

Note that the same result may be obtained with a single statement, by using a subquery (but,
generally, RDBMSs evaluate subqueries less efficiently than query hierarchies):
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
[Population]-[CityPopSum] AS UnaccPop
FROM (SELECT Sum(CITIES.Population) AS CityPopSum,
STATES.Country
FROM STATES INNER JOIN CITIES
ON STATES.x = CITIES.State
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
[Please enter desired minimum cities total population per country:]) AS [P3-2-0b]
INNER JOIN COUNTRIES ON [P3-2-0b].Country = COUNTRIES.x
ORDER BY [Population]-[CityPopSum] DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;
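
As a hedged illustration of the single-statement variant, the sketch below runs an analogous query with Python's standard sqlite3 module. The column types, the state names Tennessee, D.C., and Chișinău, the Memphis and Washington populations, and the parameter name min_total are assumptions; only part of the lab's data is loaded. It only shows how a HAVING filter on an aggregate combines with a parameter, not how Access evaluates the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and a partial, partly illustrative data set.
conn.executescript("""
CREATE TABLE COUNTRIES (x INTEGER PRIMARY KEY, Country TEXT, Population INTEGER);
CREATE TABLE STATES (x INTEGER PRIMARY KEY, State TEXT, Country INTEGER);
CREATE TABLE CITIES (x INTEGER PRIMARY KEY, City TEXT, State INTEGER, Population INTEGER);
INSERT INTO COUNTRIES VALUES
  (1, 'U.S.A.', 316836000), (2, 'U.K.', 63181775),
  (3, 'Romania', 20121641), (4, 'Moldavia', 3559500);
INSERT INTO STATES VALUES
  (1, 'New York', 1), (2, 'Tennessee', 1), (3, 'D.C.', 1),
  (4, 'Greater London', 2), (5, 'Bucharest', 3), (6, 'Chișinău', 4);
INSERT INTO CITIES VALUES
  (1, 'New York', 1, 8336697), (2, 'Memphis', 2, 655155),
  (3, 'Washington', 3, 632323), (4, 'London', 4, 8308369),
  (5, 'Bucharest', 5, 1883425), (6, 'Chișinău', 6, 671800);
""")


def countries_with_city_total_at_least(min_total):
    """Keep only countries whose summed city population reaches min_total."""
    return conn.execute("""
        SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
               COUNTRIES.Population - CityPopSum AS UnaccPop
        FROM (SELECT SUM(CITIES.Population) AS CityPopSum,
                     STATES.Country AS Country
              FROM STATES INNER JOIN CITIES ON STATES.x = CITIES.State
              GROUP BY STATES.Country
              -- The filter uses an aggregate, so it can only go in HAVING.
              HAVING SUM(CITIES.Population) >= :min_total) AS P3_2_0b
             INNER JOIN COUNTRIES ON P3_2_0b.Country = COUNTRIES.x
        ORDER BY UnaccPop DESC, CityPopSum DESC,
                 COUNTRIES.Population DESC, COUNTRIES.Country
    """, {"min_total": min_total}).fetchall()


result = countries_with_city_total_at_least(7_000_000)
print([row[0] for row in result])
```

With this subset, only the U.S.A. and the U.K. reach 7,000,000, so the other two groups are discarded after grouping.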

c.

Although it is less obvious, the best thing to do is to add the corresponding filter, as a WHERE clause, to P3-2-0b:

P-3-2-0c:
SELECT Sum(CITIES.Population) AS CityPopSum, STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
WHERE COUNTRIES.Country Like "R*"
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
[Please enter desired minimum cities total population per country:];

Figure 3.10 shows the Access parameter input window, figure 3.11 the corresponding
result of P3-2-0c for 2,500,000, and figure 3.12 the result of the corresponding P-3-2c:

SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
[Population]-[CityPopSum] AS UnaccPop
FROM [P3-2-0c] INNER JOIN COUNTRIES
ON [P3-2-0c].Country = COUNTRIES.x
ORDER BY [Population]-[CityPopSum] DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;

Figure 3.10 Entering actual parameter value for P3.2-0c

Figure 3.11 Result of P3.2-0c for 2,500,000 people

Figure 3.12 Result of P3.2c for 2,500,000 people

Please note again that, unfortunately, some programmers would rather come up with one of the
following equivalent, but not at all optimal solutions:

P-3-2-0cBis:
SELECT Sum(CITIES.Population) AS CityPopSum, STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
GROUP BY STATES.Country, COUNTRIES.Country
HAVING COUNTRIES.Country Like "R*" AND
Sum(CITIES.Population) >=
[Please enter desired minimum cities total population per country:];

or keeping P3-2-0b as such and modifying P-3-2c to:

P-3-2cBis:
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
[Population]-[CityPopSum] AS UnaccPop
FROM [P3-2-0b] INNER JOIN COUNTRIES
ON [P3-2-0b].Country = COUNTRIES.x
WHERE COUNTRIES.Country Like "R*"
ORDER BY [Population]-[CityPopSum] DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;

When you compare them, it is clear that:

• P3-2-0c only computes, in this particular case, one group (for Romania) and, generally,
about a dozen groups (for Romania, Russia, Rwanda, etc.), whereas
• Both P-3-2-0cBis and P-3-2cBis still compute all groups (four in this particular case,
but some 250 for full countries’ data) and then throw away the vast majority of their
computation results (three groups in this particular case, but some 238 for full countries’
data).

Generally, note that we cannot get rid of HAVING clauses, as this is the only place where we can
add filters on data computed (generally through aggregation) after grouping; dually, we could
sometimes get rid of WHERE clauses (not always, as sometimes we might need filters on global
applications of aggregate functions, like the one in figure 3.4 above) and only use HAVING ones,
but this would be a poor choice, both conceptually and, especially, performance-wise.
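
The difference between filtering before and after grouping can be observed with a small sketch based on Python's standard sqlite3 module. The column types, the state names Tennessee, D.C., and Chișinău, the Memphis and Washington populations, the helper groups_formed, and the threshold 1,000,000 are assumptions (only Bucharest is loaded for Romania, so the lab's 2,500,000 threshold would return nothing here). The sketch compares results and the number of groups that logically have to be built; it says nothing about what a particular optimizer does internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and a partial, partly illustrative data set.
conn.executescript("""
CREATE TABLE COUNTRIES (x INTEGER PRIMARY KEY, Country TEXT, Population INTEGER);
CREATE TABLE STATES (x INTEGER PRIMARY KEY, State TEXT, Country INTEGER);
CREATE TABLE CITIES (x INTEGER PRIMARY KEY, City TEXT, State INTEGER, Population INTEGER);
INSERT INTO COUNTRIES VALUES
  (1, 'U.S.A.', 316836000), (2, 'U.K.', 63181775),
  (3, 'Romania', 20121641), (4, 'Moldavia', 3559500);
INSERT INTO STATES VALUES
  (1, 'New York', 1), (2, 'Tennessee', 1), (3, 'D.C.', 1),
  (4, 'Greater London', 2), (5, 'Bucharest', 3), (6, 'Chișinău', 4);
INSERT INTO CITIES VALUES
  (1, 'New York', 1, 8336697), (2, 'Memphis', 2, 655155),
  (3, 'Washington', 3, 632323), (4, 'London', 4, 8308369),
  (5, 'Bucharest', 5, 1883425), (6, 'Chișinău', 6, 671800);
""")

JOINS = """
FROM COUNTRIES
     INNER JOIN STATES ON STATES.Country = COUNTRIES.x
     INNER JOIN CITIES ON CITIES.State = STATES.x
"""
GROUPING = "GROUP BY STATES.Country, COUNTRIES.Country"
NAME_FILTER = "COUNTRIES.Country LIKE 'R%'"

# Recommended form: filter on the name BEFORE grouping.
where_first = f"""
SELECT STATES.Country, SUM(CITIES.Population) AS CityPopSum
{JOINS} WHERE {NAME_FILTER} {GROUPING}
HAVING SUM(CITIES.Population) >= :min_total
"""

# Discouraged form: every country is grouped, then most groups are dropped.
having_only = f"""
SELECT STATES.Country, SUM(CITIES.Population) AS CityPopSum
{JOINS} {GROUPING}
HAVING {NAME_FILTER} AND SUM(CITIES.Population) >= :min_total
"""

params = {"min_total": 1_000_000}
result_where = conn.execute(where_first, params).fetchall()
result_having = conn.execute(having_only, params).fetchall()


def groups_formed(where_clause=""):
    """Count the groups that GROUP BY logically builds for a given WHERE clause."""
    sql = f"SELECT COUNT(*) FROM (SELECT 1 {JOINS} {where_clause} {GROUPING})"
    return conn.execute(sql).fetchone()[0]


print(result_where == result_having)                   # same final rows
print(groups_formed(f"WHERE {NAME_FILTER}"), groups_formed())
```

Both statements return the same row, but the WHERE version builds a single group while the HAVING-only version builds one group per country before discarding most of them.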

3.3 Exercises in Oracle

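P3.1 a. Compute the set of cities (name, corresponding state and country names, and population)
that have at least 1,000,000 inhabitants, in the descending order of their population and then
ascending on country, state, and city names.

b. Parameterize a. above and compute results for both 1,000,000 and 500,000.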

Solution:

What the result should look like:

Inspecting the corresponding data instances, only three cities qualify for the result (in this
order): New York, London, and Bucharest.

The result should then be:

City       STATES.State    COUNTRIES.Country  CITIES.Population
New York   New York        U.S.A.             8,336,697
London     Greater London  U.K.               8,308,369
Bucharest  Bucharest       Romania            1,883,425

Data needed for final result: City, CITIES.Population, STATES.State, and
COUNTRIES.Country;

Data needed to link these three tables’ instances: CITIES.State = STATES.x and
STATES.Country = COUNTRIES.x

Data needed for filtering: CITIES.Population

SQL solution:
SELECT City, STATES.State, COUNTRIES.Country,
CITIES.Population
FROM (CITIES INNER JOIN STATES ON CITIES.State = STATES.x)
INNER JOIN COUNTRIES ON STATES.Country = COUNTRIES.x
WHERE CITIES.Population >= 1000000
ORDER BY CITIES.Population DESC, COUNTRIES.Country,
STATES.State, City;

The result of running it against the lab’s db instance is the following:

Figure 3.13 Result of P3.1a

b.

The only difference with respect to the above query is replacing the hard-coded constant
1000000 with a parameter:
SELECT City, STATES.State, COUNTRIES.Country,
CITIES.Population
FROM (CITIES INNER JOIN STATES ON CITIES.State = STATES.x)
INNER JOIN COUNTRIES ON STATES.Country = COUNTRIES.x
WHERE CITIES.Population >=
:Minimum_city_population
ORDER BY CITIES.Population DESC, COUNTRIES.Country,
STATES.State, City;
Obviously, the result of running it against the lab’s db instance with the actual parameter value
1000000 is the same as the one in figure 3.13 above.
The result of running it against the lab’s db instance with the actual parameter value 500000 (figure
3.14) also selects Chișinău, Memphis, and Washington (figure 3.15). Note that Oracle variable
names can be of at most 30 chars and cannot contain spaces.

Figure 3.14 Entering actual parameter value for P3.1b

Figure 3.15 Result of P3.1b for 500,000

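P3.2 a. Compute the set of countries (name, population, sum of corresponding cities population,
unaccounted cities population), in the descending order of unaccounted cities population, sum of
corresponding cities population, stored countries population, and then ascending on country
names.

b. Same as a. above, but only for countries for which the sum of cities population is at least
equal to a parameter value; run it for 7,000,000 people.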

c. Same as b. above, but only for countries whose names start with ‘R’; run it for 2,500,000
people.

Solution:

What the result should look like:

Inspecting the corresponding data instances, all four countries qualify for the result
(in this order): U.S.A., U.K., Romania, and Moldavia.

The result should then be (where UnaccCityPop = Population - SumCityPop):

Country   Population   SumCityPop  UnaccCityPop
U.S.A.    316,836,000  9,624,175   307,211,825
U.K.      63,181,775   8,308,369   54,873,406
Romania   20,121,641   2,604,964   17,516,677
Moldavia  3,559,500    671,800     2,887,700

Data needed for final result: COUNTRIES.Country, COUNTRIES.Population, and
CITIES.Population;

Data needed to link these three tables’ instances: CITIES.State = STATES.x and
STATES.Country = COUNTRIES.x

SQL solution:

Both conceptually and from the RDBMSs performance point of view, it is preferable to split
complex problems into smaller and simpler sub-problems and to interconnect in the end their
solutions.

Consequently, let us first solve the sub-problem of computing the sum of cities populations per
countries.

The following query uses the SQL aggregate function SUM to compute the sum
of all city populations stored in the db (see figure 3.16 for its result):

SELECT SUM(Population) AS TotCitiesPop FROM CITIES;

Figure 3.16 The sum of all cities’ populations

For computing total city populations per country, we need to partition cities into groups,
one per country, so that SUM computes totals per country instead of the overall one:

SELECT Sum(CITIES.Population) AS CityPopSum, Country
FROM STATES INNER JOIN CITIES
ON STATES.x = CITIES.State
GROUP BY Country;

Running this query against the current lab’s db instance computes the following result:

Figure 3.17 The sum of all cities’ populations per country

In order to make use of it in the final step, you should save this query as view P3_2_0:
right-click the View node of LAB_DB, then click on New View (figure 3.18). In the Create View
window that pops up (figure 3.19), enter the Name of the view (which should be distinct from the
names of any other tables and views of LAB_DB) and copy the statement into the SQL Query text
box. Click on the Check Syntax button: the message “SQL Parse Results: No errors found in SQL”
should be displayed in the bottom-left corner of the window. Click on the Test Query button: the
Test Query window that pops up (figure 3.20) should display the “Query executed successfully”
Result. Click on Close and then on the OK button of the Create View window (figure 3.19): your
view is saved and ready to be used.

Figure 3.18 Creating a new view in Oracle SQL Developer

Figure 3.19 Naming and specifying a view in Oracle SQL Developer

Figure 3.20 Testing a view in Oracle SQL Developer

Note that the same result could have been obtained by running the following DDL statement:
--------------------------------------------------------
-- DDL for View P3_2_0
--------------------------------------------------------
CREATE OR REPLACE FORCE VIEW "LAB_DB"."P3_2_0"
("CITYPOPSUM", "COUNTRY") AS
SELECT SUM(CITIES.POPULATION) AS CityPopSum, COUNTRY
FROM STATES INNER JOIN CITIES ON STATES.X = CITIES.STATE
GROUP BY COUNTRY;
The second sub-problem is to use the results of the previous one for computing final results;
a join of view P3_2_0 with the COUNTRIES table is needed in order to get both country
names and populations:
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
Population - CityPopSum AS UnaccPop
FROM P3_2_0 INNER JOIN COUNTRIES
ON P3_2_0.Country = COUNTRIES.x
ORDER BY Population - CityPopSum DESC, CityPopSum DESC,
Population DESC, COUNTRIES.Country;
The result of running it against the lab’s db instance is the following:

Figure 3.21 Result of P3.2a

Note that, unfortunately, many programmers would actually come up with the following
equivalent, but not optimal solution:

P3_2_0Bis:

SELECT Sum(CITIES.Population) AS CityPopSum,
COUNTRIES.Country
FROM (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State) INNER JOIN COUNTRIES
ON STATES.Country = COUNTRIES.x
GROUP BY COUNTRIES.Country;

P_3_2aBis:
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
Population - CityPopSum AS UnaccPop
FROM P3_2_0Bis INNER JOIN COUNTRIES
ON P3_2_0Bis.Country = COUNTRIES.Country
ORDER BY Population - CityPopSum DESC, CityPopSum DESC,
Population DESC, COUNTRIES.Country;

Note that P3_2_0Bis already takes more time and more memory and disk space, as it performs an
additional join and computes country names (which, on average, have some 32 ASCII chars) instead
of surrogate key values (which need 4 binary bytes).

Much worse is P_3_2aBis, which joins not on surrogate key values (fixed-size integers, each
typically compared in a single machine operation), like P_3_2a, but on ASCII strings (compared
character by character, which is generally slower and involves some 32 chars per value on
average).

b.
The only thing that has to be done is to add a HAVING clause to P3_2_0; here is the
corresponding P3_2_0b:
SELECT Sum(CITIES.Population) AS CityPopSum, STATES.Country
FROM STATES INNER JOIN CITIES ON STATES.x = CITIES.State
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >= :Min_city_tot_pop_per_country;

Figure 3.22 shows Oracle’s parameter input window, figure 3.23 the corresponding
result of P3_2_0b for 7,000,000, and figure 3.24 the result of the corresponding P_3_2b (with
P3_2_0b as a subquery):

SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
Population - CityPopSum AS UnaccPop
FROM (SELECT Sum(CITIES.Population) AS CityPopSum,
STATES.Country
FROM STATES INNER JOIN CITIES
ON STATES.x = CITIES.State
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
:Min_city_tot_pop_per_country) P3_2_0b
INNER JOIN COUNTRIES ON P3_2_0b.Country = COUNTRIES.x
ORDER BY Population - CityPopSum DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;

Figure 3.22 Entering actual parameter value for P3_2_0b

Figure 3.23 Result of P3_2_0b for 7,000,000 people

Figure 3.24 Result of P3_2b for 7,000,000 people

c.

Although it is less obvious, the best thing to do is to add the corresponding filter, as a WHERE clause, to P3_2_0b:

P_3_2_0c:
SELECT Sum(CITIES.Population) AS CityPopSum, STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
WHERE COUNTRIES.Country Like 'R%'
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
:Min_city_tot_pop_per_country;

Figure 3.25 shows Oracle’s parameter input window, figure 3.26 the corresponding
result of P3_2_0c for 2,500,000, and figure 3.27 the result of the corresponding P_3_2c:

SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
Population - CityPopSum AS UnaccPop
FROM (SELECT Sum(CITIES.Population) AS CityPopSum,
STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
WHERE COUNTRIES.Country Like 'R%'
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
:Min_city_tot_pop_per_country)
P3_2_0c INNER JOIN COUNTRIES
ON P3_2_0c.Country = COUNTRIES.x
ORDER BY Population - CityPopSum DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;

Figure 3.25 Entering actual parameter value for P3_2_0c

Figure 3.26 Result of P3_2_0c for 2,500,000 people

Figure 3.27 Result of P3_2c for 2,500,000 people

Unfortunately, Oracle does not accept parameterized views; consequently, parameterized queries
are typically stored and run as PL/SQL stored procedures. As it is best to group all such
procedures addressing the same functional specifications in a PL/SQL package, let us create such
a package. Right-click the Packages node of LAB_DB and then click on New Package:

Figure 3.28 Creating a new PL/SQL package

In the Create PL/SQL Package window that pops up, enter desired package name:

Figure 3.29 Naming a new PL/SQL package

In the header of the newly created package, replace the comment /* To do … */ with the following
two declarations (see figure 3.30):
TYPE GenericCursorType IS REF CURSOR;
procedure p3_2c (min_city_tot_pop_per_country number,
rc OUT GenericCursorType);

Figure 3.30 The header of the LAB_DB_SQL PL/SQL package

For creating the package body, right-click on the package’s name and then click on Create Body…:

Figure 3.31 Creating the body of the LAB_DB_SQL PL/SQL package

In the newly created body, enter procedure’s P3_2c definition (see figure 3.32):
procedure p3_2c
(
Min_city_tot_pop_per_country in number,
rc out GenericCursorType
) is
begin
open rc for
SELECT COUNTRIES.Country, COUNTRIES.Population,
CityPopSum, Population - CityPopSum AS UnaccPop
FROM (SELECT Sum(CITIES.Population) AS CityPopSum,
STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
WHERE COUNTRIES.Country Like 'R%'
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
Min_city_tot_pop_per_country)
P3_2_0c INNER JOIN COUNTRIES
ON P3_2_0c.Country = COUNTRIES.x
ORDER BY Population - CityPopSum DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;
end p3_2c;

In order to run this packaged procedure with desired parameters, enter in a LAB_DB SQL tab the
following statements:

var c refcursor;
exec lab_db_sql.p3_2c(2500000, :c);
print c;

Running them (see figure 3.33), you get the same results as in figure 3.27 above; the main
advantage of this approach is that, from now on, you can also obtain and process this result
programmatically (e.g. in VBA, Java, .NET, etc.).

Figure 3.32 The body of the LAB_DB_SQL PL/SQL package

Figure 3.33 Running the LAB_DB_SQL.P3_2C PL/SQL packaged procedure

Please note again that, unfortunately, some programmers would rather come up with one of the
following equivalent, but not at all optimal solutions:

P_3_2_0cBis:
SELECT Sum(CITIES.Population) AS CityPopSum, STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
GROUP BY STATES.Country, COUNTRIES.Country
HAVING COUNTRIES.Country Like 'R%' AND
Sum(CITIES.Population) >= :Min_city_tot_pop_per_country;

or keeping P3_2_0b as such and modifying P_3_2c to:

P_3_2cBis:
SELECT COUNTRIES.Country, COUNTRIES.Population, CityPopSum,
Population - CityPopSum AS UnaccPop
FROM (SELECT Sum(CITIES.Population) AS CityPopSum,
STATES.Country
FROM COUNTRIES INNER JOIN (STATES INNER JOIN CITIES
ON STATES.x = CITIES.State)
ON STATES.Country = COUNTRIES.x
GROUP BY STATES.Country
HAVING Sum(CITIES.Population) >=
:Min_city_tot_pop_per_country)
P3_2_0b INNER JOIN COUNTRIES
ON P3_2_0b.Country = COUNTRIES.x
WHERE COUNTRIES.Country Like 'R%'
ORDER BY Population - CityPopSum DESC, CityPopSum DESC,
COUNTRIES.Population DESC, COUNTRIES.Country;

When you compare them, it is clear that:

• P3_2_0c only computes, in this particular case, one group (for Romania) and, generally,
about a dozen groups (for Romania, Russia, Rwanda, etc.), whereas
• Both P_3_2_0cBis and P_3_2cBis still compute all groups (four in this particular
case, but some 250 for full countries’ data) and then throw away the vast majority
of their computation results (three groups in this particular case, but some 238 for full
countries’ data).

3.4 Best practice rules

BPR3.0 Always name your SQL SELECT clause expressions with proper names (by using
the AS renaming operator).

BPR3.1 Both conceptually and from the RDBMSs performance point of view, it is preferable to
split complex problems into smaller and simpler sub-problems and to interconnect in the end
their solutions.

BPR3.2 Always use only necessary data (dually: never use unnecessary tables and/or
columns) in your queries.

BPR3.3 When possible, always join table instances on smallest numerical keys (generally,
primary surrogate ones), instead of any other existing equivalent keys.

BPR3.4 Always use WHERE for filtering as much as possible before grouping.

BPR3.5 Use HAVING only for filtering on data computed after grouping (dually: never use
HAVING for filters that can be placed on WHERE!).

BPR3.6 Never present users with unordered results, except for cases when they are explicitly
asking for it.

BPR3.7 Always order results intelligently, such as to maximize users experience with your
application.

BPR3.8 Never order data more than once: order only in the final querying step.

BPR3.9 Never order on more columns/expressions than needed: ordering costs a lot!

BPR3.10 Never order by using column positions! For example, always use SELECT x, y …
ORDER BY x, y; never use SELECT x, y … ORDER BY 1, 2; instead, as, one day,
when you have to change the select list to SELECT y, x …, you will also have to
change the ordering positions (to SELECT y, x … ORDER BY 2, 1;).
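
The pitfall behind BPR3.10 can be reproduced with a minimal sketch using Python's standard sqlite3 module. The table T, its two rows, and the helper run are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-row table, chosen so that ordering by x and by y differ.
conn.executescript("""
CREATE TABLE T (x INTEGER, y TEXT);
INSERT INTO T VALUES (1, 'b'), (2, 'a');
""")


def run(sql):
    """Execute a query and return all of its rows."""
    return conn.execute(sql).fetchall()


# Ordering by position: swapping the select list silently changes the sort key.
by_position_before = run("SELECT x, y FROM T ORDER BY 1")  # sorted by x
by_position_after = run("SELECT y, x FROM T ORDER BY 1")   # now sorted by y

# Ordering by name: the sort key stays x whatever the select list order is.
by_name_before = run("SELECT x, y FROM T ORDER BY x")
by_name_after = run("SELECT y, x FROM T ORDER BY x")

print(by_position_before, by_position_after)
print(by_name_before, by_name_after)
```

With positions, the second query returns the rows in a different order even though nobody touched the ORDER BY clause; with names, the row order is unchanged.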

3.5 Homework
H3.0 Prove that:

a. the kernel of any function is an equivalence relation.

b. the kernel of a function product is equal to the intersection of the involved kernels.

H3.1 Prove that there is no SQL solution for P3.2 above without subqueries or query hierarchies.

Hint: consider both the “GROUP BY golden rule” and the restriction that aggregate functions
cannot be composed with one another.

H3.2 Compute the set of countries having at least k states, each of which has at least n cities (k and
n being natural parameters), for which the unaccounted states and cities population per countries
are at least equal to other two distinct parameters, respectively, in the descending order of the
unaccounted states population per country, city population per country, corresponding accounted
ones, stored countries population, and then ascending on country name.

p.s. It is highly possible that an exercise of this type, generally simpler from the arithmetic point
of view, be the main oral examination subject at the end of this semester!

H3.3 a. Add to the COUNTRIES table data for Hungary, Serbia, Bulgaria, Greece, Malta, and
Ukraine.

b. Add to your lab db a table for storing the NEIGHBORS binary relation defined over
COUNTRIES: $NEIGHBORS = \{(x, y) \in COUNTRIES^2 \mid x \text{ is neighbor to } y\}$, and populate it
with actual data for all countries in COUNTRIES.

H3.4 Translate into relational algebra and optimize all the SELECT statements from these first
three DB labs.

