Algebra
IT University of Copenhagen
October 7, 2003
What you should remember from previously
Next: Basics of relational algebra.
What is an algebra?
Example:
In the algebra of arithmetic, the atomic operands are constants and
variables, and the operators are +, −, /, and ·.
Using these we can form expressions like ((x + 7)/(y − 3)) + x.
Relational algebra
σ_{a≥5}(R1 ⋈ R2) ∪ R3
Why is relational algebra useful?
Recap of set notation
Examples:
• The set of negative integers: {x | x ∈ Z (the set of integers), x < 0}.
• The set of two-tuples of strings:
{(x, y) | x is a string, and y is a string}.
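Set-builder notation maps closely onto set comprehensions in a programming language. The Python sketch below mirrors the two examples. Since the integers and the strings are infinite sets, it assumes small finite stand-ins (`integers` and `strings`, both chosen only for illustration).

```python
# Finite stand-in for Z (an assumption; the real set of integers is infinite).
integers = range(-3, 4)

# {x | x in Z, x < 0}
negatives = {x for x in integers if x < 0}

# Finite stand-in for the set of all strings (also an assumption).
strings = ["a", "b"]

# {(x, y) | x is a string, and y is a string}
pairs = {(x, y) for x in strings for y in strings}

print(sorted(negatives))  # [-3, -2, -1]
print(sorted(pairs))      # [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
```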
Selection in relational algebra
σ_C(R)
corresponds in SQL to
SELECT *
FROM R
WHERE C
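As a minimal sketch of what selection does on a set of tuples, the Python below keeps exactly those rows that satisfy a condition. The representation (each row as a sorted tuple of attribute/value pairs) and the names `row` and `select` are assumptions made for illustration, not part of relational algebra itself.

```python
def row(**attrs):
    """Build a hashable row from keyword arguments (illustrative helper)."""
    return tuple(sorted(attrs.items()))

def select(condition, relation):
    """sigma_C(R): keep the rows of R for which the condition holds."""
    return {r for r in relation if condition(dict(r))}

# Hypothetical relation R(a, b).
R = {row(a=3, b="x"), row(a=5, b="y"), row(a=8, b="z")}

# sigma_{a >= 5}(R), i.e. SELECT * FROM R WHERE a >= 5
result = select(lambda t: t["a"] >= 5, R)
print(sorted(result))  # [(('a', 5), ('b', 'y')), (('a', 8), ('b', 'z'))]
```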
Projection in relational algebra
In relational algebra, the operator π_{A1,...,An} is used for projection onto
attributes A1, ..., An.
Formally:
Projection in relational algebra and SQL
π_{A1,...,An}(R)
corresponds in SQL to
SELECT A1, ..., An
FROM R
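A small sketch of projection on sets, under the assumption that each row is stored as a sorted tuple of attribute/value pairs (the helpers `row` and `project` are illustrative names). Because the result is a set, rows that become identical after dropping attributes collapse into one, which a plain SELECT in SQL does not do; the slides on bags return to this point.

```python
def row(**attrs):
    """Build a hashable row from keyword arguments (illustrative helper)."""
    return tuple(sorted(attrs.items()))

def project(attributes, relation):
    """pi_{A1,...,An}(R) on sets: keep only the listed attributes."""
    return {tuple((a, v) for a, v in r if a in attributes) for r in relation}

# Hypothetical relation R(a, b, c).
R = {row(a=1, b="x", c=10), row(a=2, b="x", c=10), row(a=3, b="y", c=20)}

# pi_{b,c}(R): two of the three rows coincide once attribute a is removed.
result = project({"b", "c"}, R)
print(len(R), len(result))  # 3 2
```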
Set operators in relational algebra
Since relations are sets, we can apply the standard set operators.
• Union: R1 ∪ R2 = {x | x ∈ R1 or x ∈ R2}.
• Intersection: R1 ∩ R2 = {x | x ∈ R1 and x ∈ R2}.
• Difference: R1 − R2 = {x | x ∈ R1 and x ∉ R2}.
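The three definitions above correspond directly to Python's built-in set operators. The sketch assumes two small relations over the same schema, with rows written as plain tuples; the data is made up for illustration.

```python
# Two hypothetical relations with the same schema (id, name).
R1 = {(1, "a"), (2, "b"), (3, "c")}
R2 = {(2, "b"), (4, "d")}

union = R1 | R2          # {x | x in R1 or x in R2}
intersection = R1 & R2   # {x | x in R1 and x in R2}
difference = R1 - R2     # {x | x in R1 and x not in R2}

print(sorted(union))         # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
print(sorted(intersection))  # [(2, 'b')]
print(sorted(difference))    # [(1, 'a'), (3, 'c')]
```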
Problem session (5 minutes)
Try to come up with a formal definition of the natural join operation, i.e.,
the join operation used to combine decomposed relation instances.
Natural join in relational algebra
R1 ⋈ R2 = {(a1, ..., an, b1, ..., bm, c1, ..., ck)
| there exists t ∈ R1 and u ∈ R2 with
values a1, ..., an on attributes A1, ..., An of t and u,
values b1, ..., bm on attributes B1, ..., Bm of t,
and values c1, ..., ck on attributes C1, ..., Ck of u}
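The definition can be read as a nested loop over pairs (t, u) that agree on the shared attributes. The sketch below follows that reading literally; it assumes rows are stored as sorted tuples of attribute/value pairs, and the names `row` and `natural_join` are illustrative. A real DBMS would typically use a more efficient join algorithm.

```python
def row(**attrs):
    """Build a hashable row from keyword arguments (illustrative helper)."""
    return tuple(sorted(attrs.items()))

def natural_join(r1, r2):
    """R1 natural-join R2: combine rows that agree on all shared attributes."""
    result = set()
    for t in r1:
        t_d = dict(t)
        for u in r2:
            u_d = dict(u)
            shared = t_d.keys() & u_d.keys()  # the attributes A1, ..., An
            if all(t_d[a] == u_d[a] for a in shared):
                result.add(tuple(sorted({**t_d, **u_d}.items())))
    return result

# Hypothetical relations R1(a, b) and R2(a, c); a is the shared attribute.
R1 = {row(a=1, b="x"), row(a=2, b="y")}
R2 = {row(a=1, c=10), row(a=1, c=11), row(a=3, c=12)}

joined = natural_join(R1, R2)
print(len(joined))  # 2
```

Rows without a partner (a=2 in R1, a=3 in R2) do not appear in the result.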
Natural join in relational algebra and SQL
R1 ⋈ R2
corresponds in SQL to
R1 NATURAL JOIN R2
Next: Relational algebra on bags and commercial RDBMSs.
Relational algebra vs SQL
Relations in SQL are bags
What we have seen is that relations in SQL are bags (or multisets), i.e.,
tuples may appear more than once.
The fact that the same attribute may occur several times is a different (and
less important) issue that we won’t go into.
Features of relational algebra on bags
Relational algebra on bags is basically the same as relational algebra (on
sets), without duplicate elimination.
• π_{A1,...,An}(R) has one tuple for each tuple of R, even if the tuples
become identical when some attributes are removed.
• σ_C(R) contains all tuples of R satisfying C, including duplicates.
• A tuple occurs x · y times in R1 ⋈ R2 if it was formed by combining a
tuple occurring x times in R1 with a tuple occurring y times in R2.
• R1 ∪ R2 contains all tuples of R1 and R2, including duplicates.
(This corresponds to UNION ALL in SQL.)
• R1 ∩ R2 and R1 − R2 can also be defined – see book for details.
• A new duplicate elimination operator: δ(R) is the set of (different)
tuples occurring in the bag R.
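The multiplicity rules listed above can be tried out with Python's `collections.Counter`, which maps each tuple to the number of times it occurs. This is only a sketch: modelling a bag as a `Counter`, and the relations R(a, b), S(a, b) and T(a, c), are assumptions made for illustration.

```python
from collections import Counter

# Bag R(a, b): the row (1, "x") occurs twice.
R = Counter({(1, "x"): 2, (2, "x"): 1})

# Bag projection onto b: one output tuple per input tuple, duplicates kept.
proj_b = Counter()
for (a, b), n in R.items():
    proj_b[(b,)] += n

# Bag union adds multiplicities (UNION ALL in SQL).
S = Counter({(1, "x"): 1})
union_all = R + S

# Bag join with T(a, c): multiplicities multiply (x * y).
T = Counter({(1, 10): 3})
join = Counter()
for (a, b), x in R.items():
    for (a2, c), y in T.items():
        if a == a2:
            join[(a, b, c)] += x * y

# Duplicate elimination delta(R): the set of distinct tuples in the bag.
delta_R = set(R)

print(proj_b[("x",)])       # 3
print(union_all[(1, "x")])  # 3
print(join[(1, "x", 10)])   # 6
print(sorted(delta_R))      # [(1, 'x'), (2, 'x')]
```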
Why bags?
Bags are used rather than sets (which are easier to handle) for reasons of
database efficiency.
Duplicate elimination is relatively costly in both time and memory, so it is
generally an advantage to perform it only when necessary.
Duplicate elimination in SQL
Problem session (5-10 minutes)
Suppose that R and S are relations with one common attribute, A, and
consider the expression
Next: More relational algebra (and SQL).
Other kinds of join
Aggregation operators
Aggregation operators are used to compute facts about the values of some
attribute in a relation.
The standard aggregation operators are: SUM, AVG, MIN, MAX, and COUNT,
computing respectively the sum, average, minimum, maximum and number
of the attribute values.
Grouping and aggregation
The tuples in a relation are divided into groups based on the values of a
specified set of grouping attributes, and the aggregate is computed for
each group.
Semantics of aggregation
When computing an aggregate, we get one tuple for each list of values of
the grouping attributes. In addition to the grouping attributes, the tuple
contains the aggregate value(s) for the group.
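The semantics can be sketched in two steps: partition the tuples by their grouping attribute values, then compute the aggregates per group. The Python below assumes a hypothetical relation with attributes `studio` and `year`, grouped on `studio`; the data is made up for illustration.

```python
from collections import defaultdict

# Hypothetical relation movies(studio, year), stored as a list (a bag).
movies = [("A", 1925), ("A", 1950), ("B", 1940), ("B", 1960), ("B", 1980)]

# Step 1: divide the tuples into groups by the grouping attribute.
groups = defaultdict(list)
for studio, year in movies:
    groups[studio].append(year)

# Step 2: one result tuple per group: (studio, MIN(year), COUNT(year)).
result = {(studio, min(years), len(years)) for studio, years in groups.items()}

print(sorted(result))  # [('A', 1925, 2), ('B', 1940, 3)]
```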
Aggregate conditions on groups
The HAVING clause may contain conditions like MIN(year) < 1930, where
MIN(year) is the minimum value of the year attribute within the group.
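A condition on a group is evaluated after grouping and decides whether the whole group is kept. The sketch below assumes a hypothetical relation with attributes `studio` and `year` (only `year` appears in the slide) and mimics a query of the form SELECT studio, COUNT(*) ... GROUP BY studio HAVING MIN(year) < 1930.

```python
from collections import defaultdict

# Hypothetical relation movies(studio, year); the data is made up.
movies = [("A", 1925), ("A", 1950), ("B", 1940), ("B", 1960)]

# Group the year values by studio.
years_by_studio = defaultdict(list)
for studio, year in movies:
    years_by_studio[studio].append(year)

# Keep only the groups satisfying the HAVING condition MIN(year) < 1930,
# and report COUNT(*) for each group that remains.
kept = {
    studio: len(years)
    for studio, years in years_by_studio.items()
    if min(years) < 1930
}

print(kept)  # {'A': 2}
```

Studio B is dropped as a whole because its earliest year, 1940, fails the condition.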
Next: Algebraic laws.
Laws in relational algebra
Using such laws we could, e.g., conclude that the following two relational
algebra expressions are equivalent:
Basic laws in relational algebra
Problem session (5 minutes)
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
In other words: Argue that the equality holds for any relations R, S, and T.
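Before attempting the argument, it can help to check the law on a concrete instance. The Python sketch below does so for three small made-up relations; the helpers `row` and `natural_join` are illustrative names, and a single passing example is of course evidence rather than a proof.

```python
def row(**attrs):
    """Build a hashable row from keyword arguments (illustrative helper)."""
    return tuple(sorted(attrs.items()))

def natural_join(r1, r2):
    """Combine rows of r1 and r2 that agree on all shared attributes."""
    result = set()
    for t in r1:
        t_d = dict(t)
        for u in r2:
            u_d = dict(u)
            if all(t_d[a] == u_d[a] for a in t_d.keys() & u_d.keys()):
                result.add(tuple(sorted({**t_d, **u_d}.items())))
    return result

# Hypothetical relations R(a, b), S(b, c), T(c, d).
R = {row(a=1, b=1), row(a=2, b=2)}
S = {row(b=1, c=1), row(b=1, c=2), row(b=3, c=3)}
T = {row(c=1, d="p"), row(c=2, d="q")}

left = natural_join(natural_join(R, S), T)   # (R join S) join T
right = natural_join(R, natural_join(S, T))  # R join (S join T)
print(left == right)  # True
```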
Relational algebraic law school, continued
is equivalent to ((σ_{C1}(R1)) ⋈ (σ_{C2}(R1))) ∪ (R2 ⋈ (σ_{C2}(R1))).
An algebraic criterion for decomposition
R = (π_{A1,...,An,B1,...,Bm}(R)) ⋈ (π_{A1,...,An,C1,...,Ck}(R))
This is the formal way of stating that the relation instances of R1 and R2,
derived from R by projection, should always join to form R.
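The criterion can be tested on individual instances: project, join, and compare with the original. The sketch below assumes a relation R(a, b, c) decomposed into (a, b) and (a, c); the helpers `row`, `project`, `natural_join` and `joins_back` are illustrative names. Note that the criterion requires the equality for every legal instance, whereas the code only checks the instances it is given.

```python
def row(**attrs):
    """Build a hashable row from keyword arguments (illustrative helper)."""
    return tuple(sorted(attrs.items()))

def project(attributes, relation):
    """Set-based projection onto the given attributes."""
    return {tuple((a, v) for a, v in r if a in attributes) for r in relation}

def natural_join(r1, r2):
    """Combine rows of r1 and r2 that agree on all shared attributes."""
    result = set()
    for t in r1:
        t_d = dict(t)
        for u in r2:
            u_d = dict(u)
            if all(t_d[a] == u_d[a] for a in t_d.keys() & u_d.keys()):
                result.add(tuple(sorted({**t_d, **u_d}.items())))
    return result

def joins_back(relation, left_attrs, right_attrs):
    """Check R = pi_left(R) join pi_right(R) for this one instance."""
    rejoined = natural_join(project(left_attrs, relation),
                            project(right_attrs, relation))
    return rejoined == relation

# In this instance, a determines both b and c, so the projections rejoin to R.
good = {row(a=1, b="x", c=10), row(a=2, b="y", c=20)}
# Here a = 1 pairs with two (b, c) combinations, so the join adds spurious rows.
bad = {row(a=1, b="x", c=10), row(a=1, b="y", c=20)}

print(joins_back(good, {"a", "b"}, {"a", "c"}))  # True
print(joins_back(bad, {"a", "b"}, {"a", "c"}))   # False
```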
Most important points in this lecture
Next lecture (in two weeks)