Lecture-4

Introduction to SQL
Part-II

String Operations
• SQL specifies strings by enclosing them in single quotes, for example, 'Computer'.
• A single quote character that is part of a string can be specified by using two single
quote characters; for example, the string “It’s right” can be specified by 'It''s right'.
• The SQL standard specifies that the equality operation on strings is case sensitive; as a
result, the expression “'comp. sci.' = 'Comp. Sci.'” evaluates to false.
• However, some database systems, such as MySQL and SQL Server, do not distinguish
uppercase from lowercase when matching strings; as a result, “'comp. sci.' = 'Comp.
Sci.'” would evaluate to true on these systems.
• This default behaviour can, however, be changed, either at the database level or at
the level of specific attributes.
• SQL also permits a variety of functions on character strings, such as concatenating
(using “||”), extracting substrings, finding the length of strings, converting strings to
uppercase (using the function upper(s), where s is a string) and lowercase (using the
function lower(s)), removing spaces at the beginning and end of a string (using trim(s)), and so on.
• The exact set of string functions supported varies between database systems. Check
your database system’s manual for details on exactly which string functions it
supports.
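
As a rough illustration of the string rules above, the following sketch runs a few string expressions through Python's built-in sqlite3 module. SQLite is only one possible system, chosen here as an assumption because it needs no setup; other systems may name or implement these functions differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    select 'It''s right',                  -- doubled quote inside a literal
           'Comp.' || ' ' || 'Sci.',       -- concatenation
           upper('Physics'),
           lower('Physics'),
           trim('  Music  '),
           length('Watson'),
           'comp. sci.' = 'Comp. Sci.'     -- 0 here: SQLite compares case-sensitively
    """
).fetchone()
print(row)
```

The last column comes back as 0 (false) in SQLite; on a system with case-insensitive defaults, such as MySQL, the same comparison would be expected to yield true.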

Lecture-4 COMP337 by Dr. Ferhun Yorgancıoğlu
String Operations – cnt
• Pattern matching can be performed on strings using the operator like. We describe
patterns by using two special characters:
o Percent ( % ): The % character matches any substring
o Underscore ( _ ): The _ character matches any single character

• Patterns are case sensitive; that is, uppercase characters do not match lowercase
characters, or vice versa. Systems whose string matching is case insensitive by default,
such as MySQL, are an exception.

• To illustrate pattern matching, we consider the following examples:
o 'Intro%' matches any string beginning with “Intro”
o '%Comp%' matches any string containing “Comp” as a substring, for example, 'Intro. to
Computer Science', and 'Computational Biology'
o '___' matches any string of exactly three characters
o '___%' matches any string of at least three characters
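
The four example patterns can be tried out with a small sketch. It assumes a made-up course table with four titles and uses Python's built-in sqlite3 module. Note that like in SQLite ignores case for ASCII letters by default, so this sketch does not demonstrate case sensitivity; the titles are chosen so that the results are the same either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table course (title text)")
conn.executemany(
    "insert into course values (?)",
    [("Intro. to Computer Science",), ("Computational Biology",),
     ("Intro. to Biology",), ("Art",)],
)

def matching(pattern):
    """Return the titles that match the given like pattern, sorted."""
    query = "select title from course where title like ? order by title"
    return [title for (title,) in conn.execute(query, (pattern,))]

intro = matching("Intro%")           # titles beginning with "Intro"
comp = matching("%Comp%")            # titles containing "Comp"
three = matching("___")              # titles of exactly three characters
at_least_three = matching("___%")    # titles of at least three characters
print(intro, comp, three, at_least_three, sep="\n")
```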

• Wildcard characters might differ between systems. For instance, MS Access uses the
character '?' to match a single character, whilst MS SQL Server uses the character '_'.

String Operations – cnt


• SQL expresses patterns by using the like comparison operator.
• Consider the query “Find the names of all departments whose building name includes
the substring 'Watson'.”
• This query can be written as:
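
A minimal sketch of such a query, run through Python's built-in sqlite3 module. The department rows are made up for illustration, and the order by is added only to make the output order predictable. The second query shows the not like form on the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table department (dept_name text, building text)")
conn.executemany(
    "insert into department values (?, ?)",
    [("Physics", "Watson"), ("Biology", "Watson"),
     ("Comp. Sci.", "Taylor"), ("Music", "Packard")],
)

# Departments whose building name includes the substring 'Watson'.
in_watson = conn.execute(
    "select dept_name from department "
    "where building like '%Watson%' order by dept_name"
).fetchall()

# Mismatches instead of matches.
not_in_watson = conn.execute(
    "select dept_name from department "
    "where building not like '%Watson%' order by dept_name"
).fetchall()
print(in_watson, not_in_watson)
```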
• SQL allows us to search for mismatches instead of matches by using the not like
comparison operator.

• For patterns to include the special pattern characters (that is, % and _ ), SQL allows
the specification of an escape character.
• The escape character is used immediately before a special pattern character to
indicate that the special pattern character is to be treated like a normal character.
• We define the escape character for a like comparison using the escape keyword. To
illustrate, consider the following patterns, which use a backslash (\) as the escape
character:
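
Two patterns of this kind are sketched below, assuming a made-up one-column table t and Python's built-in sqlite3 module. The first pattern matches strings beginning with “ab%cd”; the second matches strings beginning with “ab\cd”. Raw Python strings are used so that the backslashes reach SQL unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (s text)")
conn.executemany(
    "insert into t values (?)",
    [("ab%cd1",), ("abXcd",), ("ab\\cd2",)],  # third value is ab\cd2
)

# \% stands for a literal percent sign; the final % is still a wildcard.
percent = conn.execute(
    r"select s from t where s like 'ab\%cd%' escape '\'"
).fetchall()

# \\ stands for a literal backslash.
backslash = conn.execute(
    r"select s from t where s like 'ab\\cd%' escape '\'"
).fetchall()
print(percent, backslash)
```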

Attribute Specification in the Select Clause
• The asterisk symbol “*” can be used in the select clause to denote “all attributes.”

• Thus, the use of instructor.* in the select clause of a query indicates that all
attributes of the instructor relation are to be selected.

• A select clause of the form select * indicates that all attributes of the result relation of
the from clause are selected.
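
A small sketch of instructor.* in a query over two relations, using Python's built-in sqlite3 module. The instructor and teaches tables and their rows are assumptions for illustration; the point is that only the attributes of instructor appear in the result even though teaches is also in the from clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text, name text, dept_name text, salary numeric);
    create table teaches (ID text, course_id text, sec_id text,
                          semester text, year integer);
    insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', 65000);
    insert into instructor values ('12121', 'Wu', 'Finance', 90000);
    insert into teaches values ('10101', 'CS-101', '1', 'Fall', 2017);
""")

cur = conn.execute(
    "select instructor.* from instructor, teaches "
    "where instructor.ID = teaches.ID"
)
columns = [d[0] for d in cur.description]  # attribute names of the result
rows = cur.fetchall()
print(columns, rows)
```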


Ordering the Display of Tuples


• SQL offers the user some control over the order in which tuples in a relation are
displayed.

• The order by clause causes the tuples in the result of a query to appear in sorted
order.

• To list in alphabetic order all instructors in the Physics department, we write:

• By default, the order by clause lists items in ascending order.

• To specify the sort order, we may specify desc for descending order or asc for
ascending order.
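
The Physics listing described above might be written as in the following sketch, which assumes a made-up instructor table and Python's built-in sqlite3 module. The second query adds desc to reverse the order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text, name text, dept_name text, salary numeric);
    insert into instructor values
        ('33456', 'Gold', 'Physics', 87000),
        ('22222', 'Einstein', 'Physics', 95000),
        ('10101', 'Srinivasan', 'Comp. Sci.', 65000);
""")

ascending = [n for (n,) in conn.execute(
    "select name from instructor where dept_name = 'Physics' order by name"
)]
descending = [n for (n,) in conn.execute(
    "select name from instructor where dept_name = 'Physics' order by name desc"
)]
print(ascending, descending)
```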
Ordering the Display of Tuples – cnt
• Furthermore, ordering can be performed on multiple attributes.

• Suppose that we wish to list the entire instructor relation in descending order of
salary. If several instructors have the same salary, we order them in ascending order
by name.

• We express this query in SQL as follows:
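
One way to express the two-level ordering is sketched here with Python's built-in sqlite3 module. The four instructor rows are made up so that two instructors share a salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text, name text, dept_name text, salary numeric);
    insert into instructor values
        ('98345', 'Kim', 'Elec. Eng.', 80000),
        ('15151', 'Mozart', 'Music', 40000),
        ('45565', 'Katz', 'Comp. Sci.', 80000),
        ('22222', 'Einstein', 'Physics', 95000);
""")

# Highest salary first; ties on salary are broken by name in ascending order.
rows = conn.execute(
    "select * from instructor order by salary desc, name asc"
).fetchall()
names = [row[1] for row in rows]
print(names)
```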


Where-Clause Predicates
• SQL includes a between comparison operator to simplify where clauses that specify
that a value be less than or equal to some value and greater than or equal to some
other value.
• If we wish to find the names of instructors with salary amounts between $90,000 and
$100,000, we can use the between comparison to write:

instead of:

• Similarly, we can use the not between comparison operator.
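
The sketch below compares the between form with the explicit pair of comparisons, and also shows not between. It assumes a made-up instructor table and Python's built-in sqlite3 module. Both bounds are inclusive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (name text, salary numeric);
    insert into instructor values
        ('Wu', 90000), ('Einstein', 95000), ('Gold', 100000),
        ('Mozart', 40000), ('Kim', 120000);
""")

def names(where):
    """Return instructor names satisfying the given where predicate, sorted."""
    query = f"select name from instructor where {where} order by name"
    return [n for (n,) in conn.execute(query)]

with_between = names("salary between 90000 and 100000")
with_comparisons = names("salary <= 100000 and salary >= 90000")
outside = names("salary not between 90000 and 100000")
print(with_between, with_comparisons, outside)
```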

Where-Clause Predicates – cnt
• SQL permits us to use the notation (v1, v2,…, vn) to denote a tuple of arity n containing
values v1, v2,…, vn; the notation is called a row constructor.
• The comparison operators can be used on tuples, and the ordering is defined
lexicographically.
• For example, (a1, a2) <= (b1, b2) is true if a1 < b1, or if a1 = b1 and a2 <= b2;
similarly, the two tuples are equal if all their attributes are equal.
• Thus, the SQL query:

can be rewritten as follows:


• Although it is part of the SQL-92 standard, some SQL implementations, notably
Oracle, do not support this syntax.
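
As an illustration of the row constructor, the sketch below writes a join between made-up instructor and teaches tables twice: once with two separate conditions and once with a single tuple comparison. It uses Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.15 or later, which is when row values were introduced there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text, name text, dept_name text);
    create table teaches (ID text, course_id text);
    insert into instructor values ('76766', 'Crick', 'Biology'),
                                  ('10101', 'Srinivasan', 'Comp. Sci.');
    insert into teaches values ('76766', 'BIO-101'), ('76766', 'BIO-301'),
                               ('10101', 'CS-101');
""")

plain = conn.execute(
    "select name, course_id from instructor, teaches "
    "where instructor.ID = teaches.ID and dept_name = 'Biology' "
    "order by course_id"
).fetchall()

with_tuples = conn.execute(
    "select name, course_id from instructor, teaches "
    "where (instructor.ID, dept_name) = (teaches.ID, 'Biology') "
    "order by course_id"
).fetchall()

# Lexicographic ordering: 1 < 2 decides, so the second components are not compared.
lexicographic = conn.execute("select (1, 5) <= (2, 0)").fetchone()[0]
print(plain, with_tuples, lexicographic)
```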


Set Operations
• The SQL operations union, intersect, and except operate on relations and correspond
to the mathematical set operations ∪, ∩, and −.
• We shall now construct queries involving the union, intersect, and except operations
over two sets.

• The set of all courses taught in the Fall 2017 semester:

• The set of all courses taught in the Spring 2018 semester:
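
The two input sets can be sketched as follows, assuming a section relation with the attributes course_id, sec_id, semester and year. The rows are made up, and the queries are run with Python's built-in sqlite3 module. Note that a plain select keeps duplicates: CS-319 has two Spring 2018 sections and so appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, sec_id text, semester text, year integer);
    insert into section values
        ('CS-101', '1', 'Fall', 2017), ('CS-347', '1', 'Fall', 2017),
        ('PHY-101', '1', 'Fall', 2017), ('CS-101', '1', 'Spring', 2018),
        ('CS-315', '1', 'Spring', 2018), ('CS-319', '1', 'Spring', 2018),
        ('CS-319', '2', 'Spring', 2018);
""")

fall = [c for (c,) in conn.execute(
    "select course_id from section "
    "where semester = 'Fall' and year = 2017 order by course_id"
)]
spring = [c for (c,) in conn.execute(
    "select course_id from section "
    "where semester = 'Spring' and year = 2018 order by course_id"
)]
print(fall, spring)
```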

Union Operation
• To find the set of all courses taught either in Fall 2017 or in Spring 2018, or both, we
write the following query.

• The union operation (like other set operations) automatically eliminates duplicates,
unlike the select clause.
• Note that the parentheses we include around each select-from-where statement are
optional but valuable for ease of reading; some databases, for instance SQLite, do not
allow the parentheses, in which case they may be dropped.
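
A sketch of the union query, run with Python's built-in sqlite3 module against a made-up section table. Because SQLite is used, the parentheses around the two select statements are dropped; the trailing order by is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, sec_id text, semester text, year integer);
    insert into section values
        ('CS-101', '1', 'Fall', 2017), ('CS-347', '1', 'Fall', 2017),
        ('PHY-101', '1', 'Fall', 2017), ('CS-101', '1', 'Spring', 2018),
        ('CS-315', '1', 'Spring', 2018), ('CS-319', '1', 'Spring', 2018),
        ('CS-319', '2', 'Spring', 2018);
""")

FALL = "select course_id from section where semester = 'Fall' and year = 2017"
SPRING = "select course_id from section where semester = 'Spring' and year = 2018"

# union removes duplicates: CS-101 and CS-319 each appear once.
either = [c for (c,) in conn.execute(f"{FALL} union {SPRING} order by course_id")]
print(either)
```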


Union Operation – cnt


• If we want to retain all duplicates, we must write union all in place of union:

• The number of duplicate tuples in the result is equal to the total number of
duplicates that appear in both sets (subqueries).
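
The effect of union all can be checked with a small sketch, again using a made-up section table and Python's built-in sqlite3 module. CS-101 occurs once in each input and CS-319 twice in the Spring input, so each appears twice in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, sec_id text, semester text, year integer);
    insert into section values
        ('CS-101', '1', 'Fall', 2017), ('CS-347', '1', 'Fall', 2017),
        ('PHY-101', '1', 'Fall', 2017), ('CS-101', '1', 'Spring', 2018),
        ('CS-315', '1', 'Spring', 2018), ('CS-319', '1', 'Spring', 2018),
        ('CS-319', '2', 'Spring', 2018);
""")

FALL = "select course_id from section where semester = 'Fall' and year = 2017"
SPRING = "select course_id from section where semester = 'Spring' and year = 2018"

with_duplicates = [c for (c,) in conn.execute(
    f"{FALL} union all {SPRING} order by course_id"
)]
print(with_duplicates)
```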

Intersect Operation
• To find the set of all courses taught in both the Fall 2017 and Spring 2018 semesters, we write:

• If we want to retain all duplicates, we must write intersect all in place of intersect:

• The number of duplicate tuples that appear in the result is equal to the minimum
number of duplicates in both sets (subqueries).

• MySQL versions before 8.0.31 do not implement the intersect operation.
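
A sketch of the intersect query on a made-up section table, run with Python's built-in sqlite3 module. Only the plain intersect form is shown, since SQLite does not accept intersect all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, sec_id text, semester text, year integer);
    insert into section values
        ('CS-101', '1', 'Fall', 2017), ('CS-347', '1', 'Fall', 2017),
        ('PHY-101', '1', 'Fall', 2017), ('CS-101', '1', 'Spring', 2018),
        ('CS-315', '1', 'Spring', 2018), ('CS-319', '1', 'Spring', 2018),
        ('CS-319', '2', 'Spring', 2018);
""")

FALL = "select course_id from section where semester = 'Fall' and year = 2017"
SPRING = "select course_id from section where semester = 'Spring' and year = 2018"

# Courses taught in both semesters.
both = [c for (c,) in conn.execute(f"{FALL} intersect {SPRING} order by course_id")]
print(both)
```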

Except Operation
• To find all courses taught in the Fall 2017 semester but not in the Spring 2018
semester, we write:

• The except operation outputs all tuples from its first input that do not occur in the
second input; that is, it performs set difference.
• Some SQL implementations, notably Oracle, use the keyword minus in place of except.
MySQL versions before 8.0.31 do not implement it at all.
• If we want to retain duplicates, we must write except all in place of except.
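
The set difference can be sketched in the same way, with a made-up section table and Python's built-in sqlite3 module. Only plain except is shown, since SQLite does not accept except all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, sec_id text, semester text, year integer);
    insert into section values
        ('CS-101', '1', 'Fall', 2017), ('CS-347', '1', 'Fall', 2017),
        ('PHY-101', '1', 'Fall', 2017), ('CS-101', '1', 'Spring', 2018),
        ('CS-315', '1', 'Spring', 2018), ('CS-319', '1', 'Spring', 2018),
        ('CS-319', '2', 'Spring', 2018);
""")

FALL = "select course_id from section where semester = 'Fall' and year = 2017"
SPRING = "select course_id from section where semester = 'Spring' and year = 2018"

# Courses taught in Fall 2017 but not in Spring 2018.
fall_only = [c for (c,) in conn.execute(f"{FALL} except {SPRING} order by course_id")]
print(fall_only)
```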

Null Values
• Null values present special problems in relational operations, including arithmetic
operations, comparison operations, and set operations.

• The result of an arithmetic expression (involving, for example, +, -, *, or /) is null if
any of the input values is null.

• For example, if a query has an expression r.A + 5, and r.A is null for a particular tuple,
then the expression result must also be null for that tuple.

• Comparisons involving nulls are more of a problem.

• For example, consider the comparison “1 < null”. It would be wrong to say this is true
since we do not know what the null value represents. But it would likewise be wrong
to claim this expression is false; if we did, “not (1 < null)” would evaluate to true,
which does not make sense.

• SQL therefore treats as unknown the result of any comparison involving a null value
(other than predicates is null and is not null, which are described later in this lecture).
• This creates a third logical value in addition to true and false.


Null Values – cnt


• Since the predicate in a where clause can involve Boolean operations such as and, or,
and not on the results of comparisons, the definitions of the Boolean operations are
extended to deal with the value unknown.
o and: The result of true and unknown is unknown, false and unknown is false, while
unknown and unknown is unknown.
o or: The result of true or unknown is true, false or unknown is unknown, while unknown
or unknown is unknown.
o not: The result of not unknown is unknown.

• You can verify that if r.A is null, then “1 < r.A” as well as “not (1 < r.A)” evaluate to
unknown.

• If the where clause predicate evaluates to either false or unknown for a tuple, that
tuple is not added to the result.
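
The truth-table entries above can be checked with a short sketch using Python's built-in sqlite3 module, which reports unknown (null) as None, true as 1 and false as 0. The relation r with a single attribute A is a made-up example holding one non-null and one null value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
truth = conn.execute(
    """
    select 1 < null,                 -- unknown
           not (1 < null),           -- not unknown
           (1 = 1) and (1 < null),   -- true and unknown
           (1 = 0) and (1 < null),   -- false and unknown
           (1 = 1) or  (1 < null),   -- true or unknown
           (1 = 0) or  (1 < null)    -- false or unknown
    """
).fetchone()

conn.execute("create table r (A integer)")
conn.executemany("insert into r values (?)", [(5,), (None,)])
kept = conn.execute("select count(*) from r where 1 < A").fetchone()[0]
kept_negated = conn.execute("select count(*) from r where not (1 < A)").fetchone()[0]
print(truth, kept, kept_negated)
```

The tuple with a null A is returned by neither query, so the two counts add up to 1 rather than 2.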
Null Values – cnt
• SQL uses the special keyword null in a predicate to test for a null value.
• Thus, to find all instructors who appear in the instructor relation with null values for
salary, we write:

• The predicate is not null succeeds if the value on which it is applied is not null.

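• SQL allows us to test whether the result of a comparison is unknown, rather than true
or false, by using the clauses is unknown and is not unknown. For example, the predicate
“salary > 10000 is unknown” is satisfied exactly by those tuples whose salary is null.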
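
The is null test can be sketched as follows, assuming a made-up instructor table in which one salary is missing, and using Python's built-in sqlite3 module. The third query shows why a plain comparison with null is not a substitute: it evaluates to unknown for every tuple, so nothing is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (name text, salary numeric);
    insert into instructor values ('Srinivasan', 65000), ('Pine', null);
""")

missing = conn.execute(
    "select name from instructor where salary is null"
).fetchall()
present = conn.execute(
    "select name from instructor where salary is not null"
).fetchall()
wrong_test = conn.execute(
    "select name from instructor where salary = null"
).fetchall()
print(missing, present, wrong_test)
```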


Aggregate Functions
• Aggregate functions are functions that take a collection (a set or multiset) of values as
input and return a single value. SQL offers five standard built-in aggregate functions:
o Average: avg
o Minimum: min
o Maximum: max
o Total: sum
o Count: count

• The input to sum and avg must be a collection of numbers, but the other operators
can operate on collections of nonnumeric data types, such as strings, as well.

Basic Aggregation
• Consider the query “Find the average salary of instructors in the Computer Science
department.” We write this query as follows:

• The result of this query is a relation with a single attribute containing a single tuple
with a numerical value corresponding to the average salary of instructors in the
Computer Science department.

• The database system may give an awkward name to the result relation attribute that
is generated by aggregation, consisting of the text of the expression; however, we can
give a meaningful name to the attribute by using the as clause as follows:
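
A sketch of the average-salary query with a named result attribute, using Python's built-in sqlite3 module. The instructor rows and the attribute name avg_salary are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (name text, dept_name text, salary numeric);
    insert into instructor values
        ('Srinivasan', 'Comp. Sci.', 65000),
        ('Katz', 'Comp. Sci.', 75000),
        ('Wu', 'Finance', 90000);
""")

cur = conn.execute(
    "select avg(salary) as avg_salary from instructor "
    "where dept_name = 'Comp. Sci.'"
)
attribute = cur.description[0][0]   # name of the single result attribute
(average,) = cur.fetchone()         # the single result tuple
print(attribute, average)
```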


Basic Aggregation – cnt


• There are cases where we must eliminate duplicates before computing an aggregate
function.
• If we do want to eliminate duplicates, we use the keyword distinct in the aggregate
expression.
• An example arises in the query “Find the total number of instructors who teach a
course in the Spring 2018 semester.”
• In this case, an instructor counts only once, regardless of the number of course
sections that the instructor teaches. The required information is contained in the
relation teaches, and we write this query as follows:

• Because of the keyword distinct preceding ID, even if an instructor teaches more than
one course, that instructor is counted only once in the result.
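
The difference that distinct makes can be sketched with a made-up teaches table in which one instructor teaches two Spring 2018 sections. The queries are run with Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table teaches (ID text, course_id text, sec_id text,
                          semester text, year integer);
    insert into teaches values
        ('10101', 'CS-315', '1', 'Spring', 2018),
        ('45565', 'CS-101', '1', 'Spring', 2018),
        ('45565', 'CS-319', '1', 'Spring', 2018),
        ('10101', 'CS-101', '1', 'Fall', 2017);
""")

WHERE = "where semester = 'Spring' and year = 2018"
instructors = conn.execute(
    f"select count(distinct ID) from teaches {WHERE}"
).fetchone()[0]
sections = conn.execute(
    f"select count(ID) from teaches {WHERE}"
).fetchone()[0]
print(instructors, sections)
```

Without distinct, instructor 45565 is counted once per section taught, which gives 3 instead of 2.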

Basic Aggregation – cnt
• We use the aggregate function count frequently to count the number of tuples in a
relation.
• The notation for this function in SQL is count (*).
• Thus, to find the number of tuples in the course relation, we write

• SQL does not allow the use of distinct with count (*).
• It is legal to use distinct with max and min, even though the result does not change.
• We can use the keyword all in place of distinct to specify duplicate retention, but
since all is the default, there is no need to do so.
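
A brief sketch of count (*) and of distinct inside max, using Python's built-in sqlite3 module. The course rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table course (course_id text, title text, dept_name text, credits integer);
    insert into course values
        ('BIO-101', 'Intro. to Biology', 'Biology', 4),
        ('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', 4),
        ('MU-199', 'Music Video Production', 'Music', 3);
""")

# Number of tuples in the course relation.
tuples = conn.execute("select count(*) from course").fetchone()[0]

# distinct is legal with max but does not change the result.
max_plain, max_distinct = conn.execute(
    "select max(credits), max(distinct credits) from course"
).fetchone()
print(tuples, max_plain, max_distinct)
```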

Aggregation with Grouping


• There are circumstances where we would like to apply the aggregate function not only
to a single set of tuples, but also to a group of sets of tuples; we specify this in SQL
using the group by clause.
• The attribute or attributes given in the group by clause are used to form groups.
• Tuples with the same value on all attributes in the group by clause are placed in one
group.
• As an illustration, consider the query “Find the average salary in each department.”
We write this query as follows:
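
The per-department average can be sketched as below, with a made-up instructor table and Python's built-in sqlite3 module. The order by is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (name text, dept_name text, salary numeric);
    insert into instructor values
        ('Srinivasan', 'Comp. Sci.', 65000), ('Katz', 'Comp. Sci.', 75000),
        ('Einstein', 'Physics', 95000), ('Gold', 'Physics', 87000),
        ('Mozart', 'Music', 40000);
""")

# One result tuple per group, that is, per distinct dept_name value.
averages = conn.execute(
    "select dept_name, avg(salary) as avg_salary "
    "from instructor group by dept_name order by dept_name"
).fetchall()
print(averages)
```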

Aggregation with Grouping – cnt
• In contrast, consider the query “Find the average salary of all instructors.” We write
this query as follows:
• In this case the group by clause has been omitted, so the entire relation is treated as
a single group.

• As another example of aggregation on groups of tuples, consider the query “Find the
number of instructors in each department who teach a course in the Spring 2018
semester.”
• Information about which instructors teach which course sections in which semester is
available in the teaches relation. However, this information has to be joined with
information from the instructor relation to get the department name of each
instructor. Thus, we write this query as follows:
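
Both queries from this slide are sketched below with made-up instructor and teaches tables and Python's built-in sqlite3 module. Since both tables have an ID attribute, the sketch writes instructor.ID inside count to avoid an ambiguous reference in SQLite; the attribute name instr_count is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text, name text, dept_name text, salary numeric);
    create table teaches (ID text, course_id text, sec_id text,
                          semester text, year integer);
    insert into instructor values
        ('10101', 'Srinivasan', 'Comp. Sci.', 65000),
        ('45565', 'Katz', 'Comp. Sci.', 75000),
        ('12121', 'Wu', 'Finance', 90000),
        ('15151', 'Mozart', 'Music', 40000);
    insert into teaches values
        ('10101', 'CS-315', '1', 'Spring', 2018),
        ('45565', 'CS-101', '1', 'Spring', 2018),
        ('45565', 'CS-319', '1', 'Spring', 2018),
        ('12121', 'FIN-201', '1', 'Spring', 2018),
        ('15151', 'MU-199', '1', 'Fall', 2017);
""")

# No group by: the whole relation forms a single group.
overall = conn.execute("select avg(salary) from instructor").fetchone()[0]

# Join, then group: instructors per department teaching in Spring 2018.
per_dept = conn.execute(
    "select dept_name, count(distinct instructor.ID) as instr_count "
    "from instructor, teaches "
    "where instructor.ID = teaches.ID and semester = 'Spring' and year = 2018 "
    "group by dept_name order by dept_name"
).fetchall()
print(overall, per_dept)
```

The Music department does not appear in the second result because its only instructor teaches no Spring 2018 section.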


Aggregation with Grouping – cnt


• When an SQL query uses grouping, it is important to ensure that the only attributes
that appear in the select clause without being aggregated are those that are
present in the group by clause.
• In other words, any attribute not present in the group by clause may appear in the
select clause only as an argument to an aggregate function; otherwise, the query is
considered erroneous.
• For example, the query “/* erroneous query */ select dept_name, ID, avg (salary) from
instructor group by dept_name;” is erroneous since ID does not appear in the group
by clause, and yet it appears in the select clause without being aggregated.
• This query also illustrates a comment written in SQL by enclosing text in “/* */”; the
same comment could also have been written as “-- erroneous query”.
• In this query, each instructor in a particular group (defined by dept_name)
can have a different ID, and since only one tuple is output for each group, there is no
unique way of choosing which ID value to output. As a result, such cases are
disallowed by SQL.
Having Clause
• At times, it is useful to state a condition that applies to groups rather than to tuples.
• For example, we might be interested in only those departments where the average
salary of the instructors is more than $42,000.
• This condition does not apply to a single tuple; rather, it applies to each group
constructed by the group by clause.
• To express such a query, we use the having clause of SQL.
• SQL applies predicates in the having clause after groups have been formed, so
aggregate functions may be used in the having clause.
• We express this query in SQL as follows:

• As was the case for the select clause, any attribute present in the having clause
without being aggregated must appear in the group by clause; otherwise, the query is
erroneous.
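
A sketch of the having query for departments whose average salary exceeds $42,000, using a made-up instructor table and Python's built-in sqlite3 module. The order by is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (name text, dept_name text, salary numeric);
    insert into instructor values
        ('Srinivasan', 'Comp. Sci.', 65000), ('Katz', 'Comp. Sci.', 75000),
        ('Einstein', 'Physics', 95000), ('Gold', 'Physics', 87000),
        ('Mozart', 'Music', 40000);
""")

# The having predicate is applied to each group after grouping.
well_paid = conn.execute(
    "select dept_name, avg(salary) as avg_salary "
    "from instructor group by dept_name "
    "having avg(salary) > 42000 order by dept_name"
).fetchall()
print(well_paid)
```

The Music group, with an average of 40000, is removed by the having clause.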

Having Clause – cnt


• The meaning of a query containing aggregation, group by or having clauses, is defined
by the following sequence of operations:
1. As was the case for queries without aggregation, the from clause is first evaluated to
get a relation.
2. If a where clause is present, the predicate in the where clause is applied on the result
relation of the from clause.
3. Tuples satisfying the where predicate are then placed into groups by the group by
clause if it is present. If the group by clause is absent, the entire set of tuples satisfying
the where predicate is treated as being in one group.
4. The having clause, if it is present, is applied to each group; the groups that do not
satisfy the having clause predicate are removed.
5. The select clause uses the remaining groups to generate tuples of the result of the
query, applying the aggregate functions to get a single result tuple for each group.

Having Clause – cnt
• To illustrate the use of both a having clause and a where clause in the same query, we
consider the query “For each course section offered in 2017, find the average total
credits (tot_cred) of all students enrolled in the section, if the section has at least 2
students.”

• Note that all the required information for the preceding query is available from the
relations takes and student, and that although the query pertains to sections, a join
with section is not needed.
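
One possible form of this query is sketched below with made-up student and takes tables and Python's built-in sqlite3 module. The where clause filters tuples before grouping, and the having clause filters groups afterwards. The sketch writes count(takes.ID) because both tables have an ID attribute and SQLite rejects an unqualified reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (ID text, name text, tot_cred integer);
    create table takes (ID text, course_id text, sec_id text,
                        semester text, year integer);
    insert into student values
        ('00128', 'Zhang', 102), ('12345', 'Shankar', 32), ('54321', 'Williams', 54);
    insert into takes values
        ('00128', 'CS-101', '1', 'Fall', 2017),
        ('12345', 'CS-101', '1', 'Fall', 2017),
        ('54321', 'CS-190', '2', 'Spring', 2017),
        ('12345', 'CS-315', '1', 'Spring', 2018),
        ('54321', 'CS-315', '1', 'Spring', 2018);
""")

sections = conn.execute(
    "select course_id, semester, year, sec_id, avg(tot_cred) "
    "from student, takes "
    "where student.ID = takes.ID and year = 2017 "
    "group by course_id, semester, year, sec_id "
    "having count(takes.ID) >= 2"
).fetchall()
print(sections)
```

CS-315 is dropped by the where clause (wrong year), and CS-190 is dropped by the having clause (only one student).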

Aggregation with Null and Boolean Values


• Null values, when they exist, complicate the processing of aggregate operators.
• For example, assume that some tuples in the instructor relation have a null value for
salary. Consider the following query to total all salary amounts:

• The values to be summed in the preceding query include null values, since we
assumed that some tuples have a null value for salary.
• Rather than say that the overall sum is itself null, the SQL standard says that the sum
operator should ignore null values in its input.

• In general, aggregate functions treat nulls according to the following rule: All
aggregate functions except count (*) ignore null values in their input collection.
• As a result of null values being ignored, the collection of values may be empty. The
count of an empty collection is defined to be 0, and all other aggregate operations
return a value of null when applied on an empty collection.
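
The null-handling rule can be checked with a short sketch, assuming a made-up instructor table with one null salary and using Python's built-in sqlite3 module, where null is reported as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (name text, salary numeric);
    insert into instructor values
        ('Srinivasan', 65000), ('Katz', 75000), ('Pine', null);
""")

# Nulls are ignored by sum, avg and count(salary), but not by count(*).
totals = conn.execute(
    "select sum(salary), avg(salary), count(salary), count(*) from instructor"
).fetchone()

# Only null salaries remain, so the input collection is empty after nulls are ignored.
empty = conn.execute(
    "select sum(salary), max(salary), count(salary) "
    "from instructor where salary is null"
).fetchone()
print(totals, empty)
```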
