Relations and Relational Algebra

CSCI1270, Fall 2009, Lecture 1


Introduced by Ted Codd (late 60’s – early 70’s)
● Before = “Network Data Model” (Cobol as DDL, DML)
● Very contentious: Database Wars
Relational data model contributes:
1. Separation of logical and physical data models (data independence)
2. Declarative query languages
3. Formal semantics
4. Query optimization (key to commercial success)
First prototypes:
● Ingres (UC Berkeley)
● System R (IBM) → DB2
CSCI1270, Fall 2008, Lecture 2
Account =
bname     acct_no  balance
Downtown  A-101    500
Brighton  A-201    900
Brighton  A-217    500
(Account is the table name; bname, acct_no, balance are the attribute names)
Terms:
● Tables → Relations
● Columns → Attributes
● Rows → Tuples
● Schema (e.g.: Acct_Schema = (bname, acct_no, balance))
Mathematical relations
Given sets: R = {1, 2, 3}, S = {3, 4}
● R × S = { (1, 3), (1, 4), (2, 3), (2, 4), (3, 3), (3, 4) }
● A relation on R, S is any subset (⊆) of R × S (e.g.: { (1, 4), (3, 4) })

Database relations
Given attribute domains:
Branches = { Downtown, Brighton, … }
Accounts = { A-101, A-201, A-217, … }
Balances = ℝ (the real numbers)
Account ⊆ Branches × Accounts × Balances, e.g.:
{ (Downtown, A-101, 500),
(Brighton, A-201, 900),
(Brighton, A-217, 500) }
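
The definitions above can be sketched in Python, assuming a set of tuples stands in for a relation (the variable names are illustrative and not part of the lecture):

```python
from itertools import product

R = {1, 2, 3}
S = {3, 4}

# R x S: every ordered pair (r, s) with r in R and s in S.
cross = set(product(R, S))

# A relation on R, S is any subset of R x S.
relation = {(1, 4), (3, 4)}
is_relation = relation <= cross

# A database relation is likewise a set of (branch, account, balance) triples.
account = {
    ("Downtown", "A-101", 500),
    ("Brighton", "A-201", 900),
    ("Brighton", "A-217", 500),
}

print(len(cross), is_relation)  # 6 True
```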
Account =
bname     acct_no  balance
Downtown  A-101    500
Brighton  A-201    900
Brighton  A-217    500

Considered equivalent to…

{ (Downtown, A-101, 500),
(Brighton, A-201, 900),
(Brighton, A-217, 500) }

Relational database semantics are defined in terms of mathematical relations



Account (bname, acct_no, balance)
Branch (bname, bcity, assets)
Depositor (cname, acct_no)
Borrower (cname, lno)
Customer (cname, cstreet, ccity)
Loan (bname, lno, amt)



Account
bname     acct_no  balance
Downtown  A-101    500
Mianus    A-215    700
Perry     A-102    400
R.H.      A-305    350
Brighton  A-201    900
Redwood   A-222    700
Brighton  A-217    750

Branch
bname     bcity       assets
Downtown  Brooklyn    9M
Redwood   Palo Alto   2.1M
Perry     Horseneck   1.7M
Mianus    Horseneck   0.4M
R.H.      Horseneck   8M
Pownel    Bennington  0.3M
N. Town   Rye         3.7M
Brighton  Brooklyn    7.1M

Depositor
cname    acct_no
Johnson  A-101
Smith    A-215
Hayes    A-102
Turner   A-305
Johnson  A-201
Jones    A-217
Lindsay  A-222

Borrower
cname     lno
Jones     L-17
Smith     L-23
Hayes     L-15
Jackson   L-14
Curry     L-93
Smith     L-11
Williams  L-17
Adams     L-16

Customer
cname     cstreet    ccity
Jones     Main       Harrison
Smith     North      Rye
Hayes     Main       Harrison
Curry     North      Rye
Lindsay   Park       Pittsfield
Turner    Putnam     Stanford
Williams  Nassau     Princeton
Adams     Spring     Pittsfield
Johnson   Alma       Palo Alto
Glenn     Sand Hill  Woodside
Brooks    Senator    Brooklyn
Green     Walnut     Stanford

Loan
bname     lno   amt
Downtown  L-17  1000
Redwood   L-23  2000
Perry     L-15  1500
Downtown  L-14  1500
Mianus    L-93  500
R.H.      L-11  900
Perry     L-16  1300



Kinds of keys
● 1. Superkeys
○ set of attributes of table for which every row has distinct set of values
○ in example below: (bname, bcity) is a superkey
● 2. Candidate keys
○ “minimal” superkeys
● 3. Primary keys
○ DBA-chosen candidate keys

Act as Integrity Constraints
i.e., guard against illegal/invalid instance of given schema
e.g., Branch = (bname, bcity, assets)
bname     bcity     assets
Brighton  Brooklyn  5M
Brighton  Boston    3M
Invalid!! (two rows share bname = Brighton, so this instance is illegal when bname is the key)
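
A minimal sketch of the superkey test on one instance, assuming rows are tuples and `is_superkey` is a hypothetical helper (not from the lecture). Passing on one instance does not prove a key, since a key is a constraint on every legal instance, but a failure shows the instance violates it:

```python
def is_superkey(rows, schema, attrs):
    """Return True if no two rows agree on all attributes in attrs."""
    idx = [schema.index(a) for a in attrs]
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in idx)
        if key in seen:
            return False  # two rows share the same key values
        seen.add(key)
    return True

schema = ("bname", "bcity", "assets")
invalid_branch = [
    ("Brighton", "Brooklyn", "5M"),
    ("Brighton", "Boston", "3M"),
]

print(is_superkey(invalid_branch, schema, ["bname"]))           # False
print(is_superkey(invalid_branch, schema, ["bname", "bcity"]))  # True
```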



Recall: Query Language = “Retrieval Program” Language
Examples:
Theoretical:
1. Relational Algebra
2. Relational Calculus
○ Tuple Relational Calculus (TRC)
○ Domain Relational Calculus (DRC)
Practical:
1. SQL (originally: SEQUEL from System R)
2. Quel (used in Ingres)
3. Datalog (Prolog-like; used in research lab systems)

● Theoretical QLs give semantics to Practical QLs



●Basic Operators
1. select ( σ )
2. project ( π )
3. union ( ∪ )
4. set difference ( – )
5. cartesian product ( × )
6. rename ( ρ )

●Closure Property
Relation(s) → Relational Operator → Relation
(every operator takes relations as input and produces a relation, so the output of one operator can feed another)



Notation: σ_{predicate} (Relation)
Relation: Can be name of table, or the result of another query
Predicate:
1.Simple
○ attribute1 = attribute2
○ attribute = constant value (also: ≠, <, >, ≤, ≥)

2. Complex
○ predicate AND predicate
○ predicate OR predicate
○ NOT predicate

Idea: Select rows from a relation based on a predicate



Notation: σ_{predicate} (Relation)

σ_{bcity = “Brooklyn”} (branch) =
bname     bcity     assets
Downtown  Brooklyn  9M
Brighton  Brooklyn  7.1M

σ_{assets > $8M} (σ_{bcity = “Brooklyn”} (branch)) =
bname     bcity     assets
Downtown  Brooklyn  9M
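
The two selections can be reproduced with a small sketch, assuming the Branch instance is stored as a set of tuples with assets as a number of millions; `select` is an illustrative helper, not an existing library call:

```python
BRANCH_SCHEMA = ("bname", "bcity", "assets")  # assets in millions (assumed encoding)
branch = {
    ("Downtown", "Brooklyn", 9.0),
    ("Redwood", "Palo Alto", 2.1),
    ("Perry", "Horseneck", 1.7),
    ("Mianus", "Horseneck", 0.4),
    ("R.H.", "Horseneck", 8.0),
    ("Pownel", "Bennington", 0.3),
    ("N. Town", "Rye", 3.7),
    ("Brighton", "Brooklyn", 7.1),
}

def select(relation, schema, predicate):
    """sigma: keep the tuples for which predicate(row as dict) holds."""
    return {t for t in relation if predicate(dict(zip(schema, t)))}

# The result of a selection is a relation, so selections nest (closure).
brooklyn = select(branch, BRANCH_SCHEMA, lambda r: r["bcity"] == "Brooklyn")
rich_brooklyn = select(brooklyn, BRANCH_SCHEMA, lambda r: r["assets"] > 8)

print(sorted(t[0] for t in brooklyn))  # ['Brighton', 'Downtown']
print(rich_brooklyn)                   # {('Downtown', 'Brooklyn', 9.0)}
```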



Notation: π_{A1, …, An} (Relation)
● Relation: name of a table or result of another query
● Each Ai is an attribute
● Idea: π selects columns (vs. σ which selects rows)

π_{cstreet, ccity} (customer) =
cstreet    ccity
Main       Harrison
North      Rye
Park       Pittsfield
Putnam     Stanford
Nassau     Princeton
Spring     Pittsfield
Alma       Palo Alto
Sand Hill  Woodside
Senator    Brooklyn
Walnut     Stanford
(10 rows from 12 customers: duplicate rows are eliminated because a relation is a set)



Notation: π_{A1, …, An} (Relation)
● Each Ai an attribute
● Idea: π selects columns (vs. σ which selects rows)

π_{bcity} (σ_{assets > 5M} (branch)) =
bcity
Brooklyn
Horseneck
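
A sketch of this query, assuming set-of-tuple relations and assets stored as a number of millions; `select` and `project` are illustrative helpers. Building the result as a set is what removes the duplicate Brooklyn:

```python
BRANCH_SCHEMA = ("bname", "bcity", "assets")  # assets in millions (assumed encoding)
branch = {
    ("Downtown", "Brooklyn", 9.0), ("Redwood", "Palo Alto", 2.1),
    ("Perry", "Horseneck", 1.7), ("Mianus", "Horseneck", 0.4),
    ("R.H.", "Horseneck", 8.0), ("Pownel", "Bennington", 0.3),
    ("N. Town", "Rye", 3.7), ("Brighton", "Brooklyn", 7.1),
}

def select(relation, schema, predicate):
    """sigma: keep the tuples for which predicate(row as dict) holds."""
    return {t for t in relation if predicate(dict(zip(schema, t)))}

def project(relation, schema, attrs):
    """pi: keep only the named columns; the set drops duplicate rows."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in relation}

big = select(branch, BRANCH_SCHEMA, lambda r: r["assets"] > 5)
cities = project(big, BRANCH_SCHEMA, ["bcity"])

print(len(big), sorted(cities))  # 3 [('Brooklyn',), ('Horseneck',)]
```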



Notation: Relation1 ∪ Relation2
R ∪ S valid only if:
1. R, S have same number of columns (arity)
2. R, S corresponding columns have same domain (compatibility)

Example (Depositor (cname, acct_no), Borrower (cname, lno)):
(π_{cname} (depositor)) ∪ (π_{cname} (borrower)) =
cname
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Adams
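
A sketch of the union example, assuming set-of-tuple relations. The hypothetical `union` helper checks only arity; domain compatibility is not something plain Python tuples can verify, so that condition is left to the caller:

```python
depositor = {
    ("Johnson", "A-101"), ("Smith", "A-215"), ("Hayes", "A-102"),
    ("Turner", "A-305"), ("Johnson", "A-201"), ("Jones", "A-217"),
    ("Lindsay", "A-222"),
}
borrower = {
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"), ("Adams", "L-16"),
}

def project_cname(relation):
    """pi_cname: cname is the first column of both relations."""
    return {(t[0],) for t in relation}

def union(r, s):
    """Set union, rejecting operands whose tuples differ in arity."""
    if len({len(t) for t in r | s}) > 1:
        raise ValueError("relations must have the same arity")
    return r | s

names = union(project_cname(depositor), project_cname(borrower))
print(len(names))  # 10
```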



Notation: Relation1 – Relation2
R – S valid only if:
1. R, S have same number of columns (arity)
2. R, S corresponding columns have same domain (compatibility)

Example:
(π_{bname} (σ_{amt ≥ 1000} (loan))) – (π_{bname} (σ_{balance ≥ 700} (account)))

σ_{amt ≥ 1000} (loan) =
bname     lno   amt
Downtown  L-17  1000
Redwood   L-23  2000
Perry     L-15  1500
Downtown  L-14  1500
Perry     L-16  1300

σ_{balance ≥ 700} (account) =
bname     acct_no  balance
Mianus    A-215    700
Brighton  A-201    900
Redwood   A-222    700
Brighton  A-217    750

Projecting each side on bname and subtracting:
{ Downtown, Redwood, Perry } – { Mianus, Brighton, Redwood } = { Downtown, Perry }
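
A sketch of the set-difference example on the Loan and Account instances. It assumes the account predicate is balance ≥ 700, which is the condition consistent with the four account rows the slide displays; treat that threshold as an assumption rather than the lecture's stated value:

```python
loan = {
    ("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000),
    ("Perry", "L-15", 1500), ("Downtown", "L-14", 1500),
    ("Mianus", "L-93", 500), ("R.H.", "L-11", 900),
    ("Perry", "L-16", 1300),
}
account = {
    ("Downtown", "A-101", 500), ("Mianus", "A-215", 700),
    ("Perry", "A-102", 400), ("R.H.", "A-305", 350),
    ("Brighton", "A-201", 900), ("Redwood", "A-222", 700),
    ("Brighton", "A-217", 750),
}

# pi_bname(sigma_{amt >= 1000}(loan))
big_loan_branches = {(b,) for (b, _lno, amt) in loan if amt >= 1000}

# pi_bname(sigma_{balance >= 700}(account)); the threshold is an assumption
big_account_branches = {(b,) for (b, _acct, bal) in account if bal >= 700}

# Both operands have arity 1 over the same domain, so "-" is valid.
result = big_loan_branches - big_account_branches
print(sorted(result))  # [('Downtown',), ('Perry',)]
```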



Notation: Relation1 × Relation2
R × S like cross product for mathematical relations:
● every tuple of R appended to every tuple of S
Example:
depositor × borrower =
depositor.cname  acct_no  borrower.cname  lno
Johnson          A-101    Jones           L-17
Johnson          A-101    Smith           L-23
Johnson          A-101    Hayes           L-15
Johnson          A-101    Jackson         L-14
Johnson          A-101    Curry           L-93
Johnson          A-101    Smith           L-11
Johnson          A-101    Williams        L-17
Johnson          A-101    Adams           L-16
Smith            A-215    Jones           L-17
…                …        …               …
How many tuples in the result?
A: 56 (7 depositor tuples × 8 borrower tuples)
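
The count can be checked with a short sketch, assuming set-of-tuple relations and `itertools.product` to pair the tuples:

```python
from itertools import product

depositor = {
    ("Johnson", "A-101"), ("Smith", "A-215"), ("Hayes", "A-102"),
    ("Turner", "A-305"), ("Johnson", "A-201"), ("Jones", "A-217"),
    ("Lindsay", "A-222"),
}
borrower = {
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"), ("Adams", "L-16"),
}

# Append every borrower tuple to every depositor tuple.
cross = {d + b for d, b in product(depositor, borrower)}

print(len(depositor), len(borrower), len(cross))  # 7 8 56
```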



Notation: ρ_{identifier} (Relation)
renames a relation, or

Notation: ρ_{identifier0 (identifier1, …, identifiern)} (Relation)
renames relation and columns of n-column relation

Use:
massage relations to make ∪, – valid, or × more readable



Notation: ρ_{identifier0 (identifier1, …, identifiern)} (Relation)
renames relation and columns of n-column relation
Example:
ρ_{res (dcname, acctno, bcname, lno)} (depositor × borrower) =
res:
dcname   acctno  bcname    lno
Johnson  A-101   Jones     L-17
Johnson  A-101   Smith     L-23
Johnson  A-101   Hayes     L-15
Johnson  A-101   Jackson   L-14
Johnson  A-101   Curry     L-93
Johnson  A-101   Smith     L-11
Johnson  A-101   Williams  L-17
Johnson  A-101   Adams     L-16
Smith    A-215   Jones     L-17
…        …       …         …
(the columns depositor.cname, acct_no, borrower.cname, lno become dcname, acctno, bcname, lno; the tuples are unchanged)



●Determine lno for loans that are for an amount that is larger than the amount of some other loan (i.e., lno for all non-minimal loans)
Can do in steps:
Temp1 ← …
Temp2 ← … Temp1 …



1. Find the base data we need

Temp1 ← π_{lno, amt} (loan)
lno   amt
L-17  1000
L-23  2000
L-15  1500
L-14  1500
L-93  500
L-11  900
L-16  1300

2. Make a copy of (1)

Temp2 ← ρ_{Temp2 (lno2, amt2)} (Temp1)
lno2  amt2
L-17  1000
L-23  2000
L-15  1500
L-14  1500
L-93  500
L-11  900
L-16  1300



3. Take the cartesian product of 1 and 2

Temp3 ← Temp1 × Temp2

lno   amt   lno2  amt2
L-17  1000  L-17  1000
L-17  1000  L-23  2000
…     …     …     …
L-17  1000  L-16  1300
L-23  2000  L-17  1000
L-23  2000  L-23  2000
…     …     …     …
L-23  2000  L-16  1300
…     …     …     …



4. Select non-minimal loans

Temp4 ← σ_{amt > amt2} (Temp3)

5. Project on lno

Result ← π_{lno} (Temp4)

… or, if you prefer…

π_{lno} (σ_{amt > amt2} (π_{lno, amt} (loan) × (ρ_{Temp2 (lno2, amt2)} (π_{lno, amt} (loan)))))
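
The five steps can be traced in a short sketch, assuming set-of-tuple relations; the `temp1` … `temp4` names mirror the slide, and since tuples here are positional, the rename step only copies the relation:

```python
from itertools import product

loan = {
    ("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000),
    ("Perry", "L-15", 1500), ("Downtown", "L-14", 1500),
    ("Mianus", "L-93", 500), ("R.H.", "L-11", 900),
    ("Perry", "L-16", 1300),
}

# 1. Temp1 <- pi_{lno, amt}(loan)
temp1 = {(lno, amt) for (_bname, lno, amt) in loan}
# 2. Temp2 <- rho_{Temp2(lno2, amt2)}(Temp1): same tuples, new column names
temp2 = set(temp1)
# 3. Temp3 <- Temp1 x Temp2, columns (lno, amt, lno2, amt2)
temp3 = {a + b for a, b in product(temp1, temp2)}
# 4. Temp4 <- sigma_{amt > amt2}(Temp3)
temp4 = {t for t in temp3 if t[1] > t[3]}
# 5. Result <- pi_{lno}(Temp4)
result = {(t[0],) for t in temp4}

print(len(temp3), sorted(result))
```

Only L-93 (amount 500, the minimum) is missing from the result, because no other loan has a smaller amount to pair it with.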



Relational Algebra
1. SELECT ( σ )
2. PROJECT ( π )
3. UNION ( ∪ )
4. SET DIFFERENCE ( – )
5. CARTESIAN PRODUCT ( × )
6. RENAME ( ρ )

● Relational algebra gives semantics to practical query languages
● Above set: minimal relational algebra
○ will now look at some redundant (but useful!) operators



Express the following query in the RA:
Find the names of customers who have both accounts and loans
T1 ← ρ_{T1 (cname2, lno)} (borrower)
T2 ← depositor × T1
T3 ← σ_{cname = cname2} (T2)
Result ← π_{cname} (T3)
Above sequence of operators (ρ, ×, σ) very common.
Motivates additional (redundant) RA operators.



1. Natural Join ( ⋈ )
2. Division ( ÷ )
3. Generalized Projection ( π )
4. Outer Joins ( ⟕, ⟖, ⟗ )
5. Update ( ← ) (we’ve already been using)
● Redundant: Above can be expressed in terms of minimal RA
○ e.g. depositor ⋈ borrower = π…(σ…(depositor × ρ…(borrower)))
● Added for convenience
Notation: Relation1 ⋈ Relation2
Idea: combines ρ, ×, σ

r =
A  B  C  D
1  α  +  10
2  α  -  10
2  α  -  20
3  β  +  10

s =
E    B  D
‘a’  α  10
‘a’  α  20
‘b’  β  10
‘c’  β  10

r ⋈ s =
A  B  C  D   E
1  α  +  10  ‘a’
2  α  -  10  ‘a’
2  α  -  20  ‘a’
3  β  +  10  ‘b’
3  β  +  10  ‘c’

depositor ⋈ borrower =
π_{cname, acct_no, lno} (σ_{cname = cname2} (depositor × ρ_{t (cname2, lno)} (borrower)))
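
A sketch of natural join as "match on all shared attribute names, keep each shared column once", assuming set-of-tuple relations with explicit schemas; `natural_join` is an illustrative helper, not a library function:

```python
def natural_join(r, r_schema, s, s_schema):
    """Join r and s on every attribute name the two schemas share."""
    common = [a for a in r_schema if a in s_schema]
    s_rest = [a for a in s_schema if a not in common]
    out_schema = tuple(r_schema) + tuple(s_rest)
    result = set()
    for t in r:
        for u in s:
            if all(t[r_schema.index(a)] == u[s_schema.index(a)] for a in common):
                result.add(t + tuple(u[s_schema.index(a)] for a in s_rest))
    return result, out_schema

depositor = {
    ("Johnson", "A-101"), ("Smith", "A-215"), ("Hayes", "A-102"),
    ("Turner", "A-305"), ("Johnson", "A-201"), ("Jones", "A-217"),
    ("Lindsay", "A-222"),
}
borrower = {
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"), ("Adams", "L-16"),
}

joined, schema = natural_join(
    depositor, ("cname", "acct_no"), borrower, ("cname", "lno")
)
# Customers with both an account and a loan: project the join on cname.
both = {(t[0],) for t in joined}

print(schema)        # ('cname', 'acct_no', 'lno')
print(sorted(both))  # [('Hayes',), ('Jones',), ('Smith',)]
```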



Notation: Relation1 ÷ Relation2
Idea: expresses “for all” queries

r =
A  B
α  1
α  2
α  3
β  1
γ  1
γ  3
γ  4
γ  6
δ  1
δ  2

s =
B
1
2

r ÷ s =
A
α
δ

Query: Find values for A in r which have corresponding B values for all B values in s

Another way to look at it: ÷ and ×
17 ÷ 3 = 5
The largest value of i such that: i × 3 ≤ 17

Relational Division
r = { (α, 1), (α, 2), (α, 3), (β, 1), (γ, 1), (γ, 3), (γ, 4), (γ, 6), (δ, 1), (δ, 2) }
s = { 1, 2 }
r ÷ s = t = { α, δ }
The largest value of t such that: (t × s ⊆ r)
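
A sketch of division for the binary case, assuming r has columns (A, B) and s has the single column B; `divide` is an illustrative helper. The last lines check the "largest t with t × s ⊆ r" characterization on this instance:

```python
from itertools import product

r = {
    ("α", 1), ("α", 2), ("α", 3), ("β", 1), ("γ", 1),
    ("γ", 3), ("γ", 4), ("γ", 6), ("δ", 1), ("δ", 2),
}
s = {(1,), (2,)}

def divide(r, s):
    """Return the A values paired in r with every B value in s."""
    candidates = {a for (a, _b) in r}
    return {(a,) for a in candidates if all((a, b) in r for (b,) in s)}

t = divide(r, s)
print(sorted(t))  # [('α',), ('δ',)]

# t x s is contained in r ...
t_times_s = {a + b for a, b in product(t, s)}
print(t_times_s <= r)  # True

# ... and adding any other A value to t breaks the containment.
others = {(a,) for (a, _b) in r} - t
print(all(not {x + b for b in s} <= r for x in others))  # True

# Integer analogy from the slide: the largest i with i * 3 <= 17.
print(17 // 3)  # 5
```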
A More Complex Example

r =
A  B  C  D  E
α  a  α  a  1
α  a  γ  a  1
α  a  γ  b  1
β  a  γ  a  1
β  a  γ  b  3
γ  a  γ  a  1
γ  a  γ  b  1
γ  a  β  b  1

s =
D  E
a  1
b  1

r ÷ s = t =
A  B  C
α  a  γ
γ  a  γ



Notation: π_{e1, …, en} (Relation)
e1, …, en can include arithmetic expressions, not just attributes
Example:
credit =
cname   limit  balance
Jones   5000   2000
Turner  3000   2500

Then…
π_{cname, limit - balance} (credit) =
cname   limit-balance
Jones   3000
Turner  500
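
A sketch of generalized projection, assuming each output column is given as a function of the row (an encoding chosen for this example; `generalized_project` is a hypothetical helper):

```python
CREDIT_SCHEMA = ("cname", "limit", "balance")
credit = {
    ("Jones", 5000, 2000),
    ("Turner", 3000, 2500),
}

def generalized_project(relation, schema, exprs):
    """Evaluate each expression in exprs on every row (passed as a dict)."""
    rows = (dict(zip(schema, t)) for t in relation)
    return {tuple(e(row) for e in exprs) for row in rows}

available = generalized_project(
    credit,
    CREDIT_SCHEMA,
    [lambda r: r["cname"], lambda r: r["limit"] - r["balance"]],
)
print(sorted(available))  # [('Jones', 3000), ('Turner', 500)]
```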



● Aggregation function takes a collection of values and returns a single value as a result.
○ avg: average value
○ min: minimum value
○ max: maximum value
○ sum: sum of values
○ count: number of values
● Aggregate operation in relational algebra

_{G1, G2, …, Gn} g_{F1(A1), F2(A2), …, Fn(An)} (E)


● E is any relational-algebra expression
● G1, G2 …, Gn is a list of attributes on which to group (can be empty)
● Each Fi is an aggregate function
● Each Ai is an attribute name



● Relation r:
A  B  C
α  α  7
α  β  7
β  β  3
β  β  10

g_{sum(C)} (r) =
sum-C
27



● Relation account grouped by branch-name:
branch-name  account-number  balance
Perryridge   A-102           400
Perryridge   A-201           900
Brighton     A-217           750
Brighton     A-215           750
Redwood      A-222           700

_{branch-name} g_{sum(balance)} (account) =
branch-name  balance
Perryridge   1300
Brighton     1500
Redwood      700
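
A sketch of the aggregate operation with one aggregate function, assuming set-of-tuple relations; `aggregate` is an illustrative helper. The values of each group are collected in a list so that equal balances (750 and 750) are both counted:

```python
from collections import defaultdict

def aggregate(relation, schema, group_by, agg_attr, fn):
    """Group tuples on group_by, then apply fn to each group's agg_attr values."""
    gi = [schema.index(a) for a in group_by]
    ai = schema.index(agg_attr)
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[i] for i in gi)].append(t[ai])
    return {key + (fn(values),) for key, values in groups.items()}

ACCOUNT_SCHEMA = ("branch-name", "account-number", "balance")
account = {
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750), ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
}
by_branch = aggregate(account, ACCOUNT_SCHEMA, ["branch-name"], "balance", sum)
print(sorted(by_branch))
# [('Brighton', 1500), ('Perryridge', 1300), ('Redwood', 700)]

# An empty grouping list aggregates over the whole relation.
r = {("α", "α", 7), ("α", "β", 7), ("β", "β", 3), ("β", "β", 10)}
total = aggregate(r, ("A", "B", "C"), [], "C", sum)
print(total)  # {(27,)}
```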
●Result of aggregation does not have a name
○ Can use rename operation to give it a name
○ For convenience, we permit renaming as part of aggregate operation

_{branch-name} g_{sum(balance) as sum-balance} (account)



Motivation:
loan =
bname     lno    amt
Downtown  L-170  3000
Redwood   L-230  4000
Perry     L-260  1700

borrower =
cname  lno
Jones  L-170
Smith  L-230
Hayes  L-155

loan ⋈ borrower =
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith

Join result loses…
● any record of Perry
● any record of Hayes
1. Left Outer Join ( ⟕ )
● preserves all tuples in left relation

loan ⟕ borrower = (loan and borrower as in the Motivation example; ┴ = NULL)
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
Perry     L-260  1700  ┴



2. Right Outer Join ( ⟖ )
● preserves all tuples in right relation

loan ⟖ borrower = (loan and borrower as in the Motivation example; ┴ = NULL)
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
┴         L-155  ┴     Hayes



3. Full Outer Join ( ⟗ )
● preserves all tuples in both relations

loan ⟗ borrower = (loan and borrower as in the Motivation example; ┴ = NULL)
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
Perry     L-260  1700  ┴
┴         L-155  ┴     Hayes
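
A sketch of the three outer joins on this loan/borrower instance, assuming set-of-tuple relations joined on lno and Python's `None` standing in for ┴ (NULL). The function names are illustrative:

```python
loan = {
    ("Downtown", "L-170", 3000),
    ("Redwood", "L-230", 4000),
    ("Perry", "L-260", 1700),
}
borrower = {("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")}

def inner_join(loan, borrower):
    """Natural join on lno; output columns (bname, lno, amt, cname)."""
    return {(b, l, a, c) for (b, l, a) in loan for (c, l2) in borrower if l == l2}

def left_outer(loan, borrower):
    """Inner join plus unmatched loan tuples, padded with None."""
    matched = {l for (_c, l) in borrower}
    dangling = {(b, l, a, None) for (b, l, a) in loan if l not in matched}
    return inner_join(loan, borrower) | dangling

def right_outer(loan, borrower):
    """Inner join plus unmatched borrower tuples, padded with None."""
    matched = {l for (_b, l, _a) in loan}
    dangling = {(None, l, None, c) for (c, l) in borrower if l not in matched}
    return inner_join(loan, borrower) | dangling

def full_outer(loan, borrower):
    """Union of the left and right outer joins."""
    return left_outer(loan, borrower) | right_outer(loan, borrower)

print(len(inner_join(loan, borrower)), len(full_outer(loan, borrower)))  # 2 4
```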



Notation: Identifier ← Query
Common Uses:
1. Deletion: r ← r – s
e.g., account ← account – σ_{bname = Perry} (account)
(deletes all Perry accounts)
2. Insertion: r ← r ∪ s
e.g., branch ← branch ∪ {(Waltham, Boston, 7M)}
(inserts new branch with bname = Waltham, bcity = Boston, assets = 7M)
e.g., depositor ← depositor ∪ (ρ_{temp (cname, acct_no)} (borrower))
(adds all borrowers to depositors, treating lno’s as acct_no’s)
3. Update: r ← π_{e1, …, en} (r)
e.g., account ← π_{bname, acct_no, balance*1.05} (account)
(adds 5% interest to account balances)
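
A sketch of the three uses as reassignments of a Python variable, assuming set-of-tuple relations and a three-row sample of account chosen for brevity. The `round` call is an addition of this sketch to keep floating-point balances tidy:

```python
account = {
    ("Downtown", "A-101", 500),
    ("Perry", "A-102", 400),
    ("Brighton", "A-201", 900),
}
branch = {("Downtown", "Brooklyn", "9M")}

# 1. Deletion: account <- account - sigma_{bname = Perry}(account)
account = account - {t for t in account if t[0] == "Perry"}

# 2. Insertion: branch <- branch U {(Waltham, Boston, 7M)}
branch = branch | {("Waltham", "Boston", "7M")}

# 3. Update: account <- pi_{bname, acct_no, balance*1.05}(account)
account = {(b, a, round(bal * 1.05, 2)) for (b, a, bal) in account}

print(sorted(account))
print(sorted(branch))
```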