
RelationalModel

The document discusses concepts related to the relational model of data. It defines key terms like relation, tuple, attribute, domain, schema, and others. It also explains concepts like candidate keys, foreign keys, and relational algebra operations.

Relational Model Concepts
• The relational model of data is based on the concept of a relation.
• A relation is a mathematical concept based on the ideas of sets.
• The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations:
  - Mathematical relations
  - Set theory
  - First-order logic
Relational Model Concepts
• The model was first proposed by Dr. E. F. Codd of IBM in 1970 in the following paper:
"A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.
INFORMAL DEFINITIONS
• RELATION: A table of values
  - A relation may be thought of as a set of tuples (rows).
  - A relation may alternately be thought of as a set of attributes (columns).
  - Each row represents a fact that corresponds to a real-world entity or relationship.
  - Each row has a value of an item or set of items that uniquely identifies that row in the table.
  - Sometimes row-ids or sequential numbers are assigned to identify the rows in the table.
  - Each column is typically referred to by its column name, column header, or attribute name.
FORMAL DEFINITIONS
• The schema of a relation: R(A1, A2, ..., An)
Relation schema R is defined over attributes A1, A2, ..., An.
For example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
Here, CUSTOMER is a relation defined over the four attributes Cust-id, Cust-name, Address, Phone#, each of which has a domain or a set of valid values. For example, the domain of Cust-id is 6-digit numbers.
FORMAL DEFINITIONS
• A tuple is an ordered set of values.
• Each value is derived from an appropriate domain.
• Each row in the CUSTOMER table may be referred to as a tuple in the table and would consist of four values.
• <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000"> is a tuple belonging to the CUSTOMER relation.
• A relation may be regarded as a set of tuples (rows).
• Columns in a table are also called attributes of the relation.
DEFINITIONS
• Definition of relation
Every relation consists of a heading and a body. The heading of a relation is a tuple heading, i.e., the complete set of attributes, where every attribute has a name and a type. The body of a relation is the set of tuples all having that heading.
  - The count of tuples is the cardinality of the relation.
  - The count of attributes is the degree.
FORMAL DEFINITIONS
• A domain has a logical definition: e.g., “USA_phone_numbers” is the set of 10-digit phone numbers valid in the U.S.
• A domain may have a data type or a format defined for it. USA_phone_numbers may have the format (ddd)-ddd-dddd, where each d is a decimal digit. Dates, for example, have various formats such as "monthname date, year", "yyyy-mm-dd", or "dd mm, yyyy".
• An attribute designates the role played by the domain. E.g., the domain Date may be used to define the attributes “Invoice-date” and “Payment-date”.
FORMAL DEFINITIONS
• A relation is formed over the Cartesian product of sets; each set has values from a domain, and that domain is used in a specific role, which is conveyed by the attribute name.
• Definition of relation
The relation r(R) is a subset of the Cartesian product of the domain sets dom(A1), …, dom(An) that define R.
Given R(A1, A2, ..., An),
$r(R) \subseteq dom(A_1) \times dom(A_2) \times \cdots \times dom(A_n)$
  - R: schema of the relation
  - r of R: a specific "value" or population of R
  - R is also called the intension of a relation
  - r is also called the extension of a relation
FORMAL DEFINITIONS
• Let S1 = {0, 1}
• Let S2 = {a, b, c}
• Let $R \subseteq S_1 \times S_2$
• Then, for example, r(R) = {<0,a>, <0,b>, <1,c>} is one possible “state” or “population” or “extension” r of the relation R, defined over domains S1 and S2. It has three tuples.
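The S1/S2 example can be checked mechanically. The following is a minimal sketch, assuming a relation state is modeled as a Python set of tuples; the variable names are illustrative and not part of the original slides.

```python
from itertools import product

S1 = {0, 1}
S2 = {"a", "b", "c"}

# The full Cartesian product S1 x S2: every possible (S1 value, S2 value) pair.
cartesian = set(product(S1, S2))

# One possible extension r(R); any subset of the product is a valid state.
r = {(0, "a"), (0, "b"), (1, "c")}

is_valid_state = r <= cartesian      # subset test
cardinality = len(r)                 # number of tuples
degree = len(next(iter(r)))          # number of attributes per tuple

print(len(cartesian), is_valid_state, cardinality, degree)
```

Using a set for the body mirrors the definition directly: tuples are unordered and duplicates cannot occur.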
DEFINITION SUMMARY

Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table definition      Schema of a relation
Populated table       Extension
Example
PROPERTIES OF RELATIONS
• Every tuple contains exactly one value for each attribute (values are atomic).
• The tuples are not considered to be ordered, even though they appear to be in the tabular form.
• There is no ordering of the attributes.
• There are no duplicate tuples.
Relations vs. Tables
• A relation cannot have a duplicate tuple by definition; tables, if ill-managed, can.
• Rows and columns are ordered, whereas tuples and attributes are not.
• A table must be created with at least one attribute; a relation needs none.
• Tables may contain nulls; a relation cannot.
  - A special null value is used to represent values that are unknown or inapplicable to certain rows.
Candidate Key
Keys (Super, Candidate, Primary, Alternate)
• Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), $t_1[SK] \neq t_2[SK]$ (uniqueness property).
• Candidate key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.
Example: The CAR relation schema
CAR(State, Reg#, SerialNo, Make, Model, Year)
has two candidate keys, Key1 = {State, Reg#} and Key2 = {SerialNo}, which are also superkeys. {SerialNo, Make} is a superkey but not a candidate key.
• If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The other candidate keys are called alternate keys.
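The uniqueness and minimality tests can be sketched in code. This is only an illustration: the CAR tuples below are invented sample data, and the helper names `is_unique` and `is_minimal` are hypothetical. A key is a constraint on every valid instance, so checking one sample can refute a proposed key but cannot prove it.

```python
from itertools import combinations

HEADING = ("State", "Reg#", "SerialNo", "Make", "Model", "Year")

# Hypothetical sample tuples, invented for illustration only.
CARS = [
    ("TX", "ABC-123", "A69352", "Ford", "Mustang", 2002),
    ("TX", "XYZ-987", "B43696", "Ford", "Focus", 2005),
    ("FL", "ABC-123", "X83554", "Toyota", "Camry", 2005),
]

def is_unique(attrs, heading=HEADING, rows=CARS):
    """True if no two rows agree on every attribute in attrs (superkey test)."""
    idx = [heading.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

def is_minimal(attrs, heading=HEADING, rows=CARS):
    """True if attrs is unique and dropping any one attribute breaks uniqueness."""
    if not is_unique(attrs, heading, rows):
        return False
    return all(
        not is_unique(subset, heading, rows)
        for subset in combinations(attrs, len(attrs) - 1)
    )

print(is_minimal(("State", "Reg#")))      # behaves as a candidate key on this sample
print(is_minimal(("SerialNo", "Make")))   # unique, but not minimal
```

On such a small sample, other attribute sets (for example {Model}) also happen to be unique, which is why the schema designer, not the data, decides what the keys are.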
Supplier-part Database
Foreign key
• A relation schema may have an attribute that corresponds to the primary key of another relation. Such an attribute is called a foreign key.
Operations in Relational Model
• Relational Algebra
• Relational Calculus
  - Tuple relational calculus
  - Domain relational calculus
Relational Algebra
• The relational algebra is a collection of operators that take relations as their operands and return a relation as their result
• Eight operators, in two groups of four
  - Set operators: union, intersect, difference, Cartesian product
  - Special operators: restrict, project, join, divide
• The set of possible relational operators is essentially unlimited
Closure Property
• The output from any relational operator is another relation: the closure property
• Relational expressions can be nested (analogously to arithmetic expressions)
• Every relation has a heading and a body; relational algebra must address both
• The RENAME operator changes the name of an attribute without changing its type or content
Union
• Union operates on two relations and returns a relation that contains all tuples belonging to either or both
• Both relations must be of the same type: all attribute names and corresponding types must be the same. This is known as union compatibility
• Relations cannot have duplicate tuples; we say loosely that UNION “eliminates duplicates”
Union Operation – Example
• Relations r, s:

r:  A  B        s:  A  B
    α  1            α  2
    α  2            β  3
    β  1

• Symbolic form: $r \cup s$
• r UNION s:

    A  B
    α  1
    α  2
    β  1
    β  3
Intersect
• Intersect operates on two sets and returns a set that contains all tuples belonging to both
• For Intersect, the sets operated upon must be of the same type, known as union compatibility
Example: Intersect
• Relations r, s:

r:  A  B        s:  A  B
    α  1            α  2
    α  2            β  3
    β  1

• Symbolic form: $r \cap s$
• r INTERSECT s:

    A  B
    α  2
Difference
• Difference operates on two sets and returns a set containing all tuples occurring in the first but not in the second, using MINUS
• For Difference, the sets operated upon must be of the same type, known as union compatibility
Example: Difference
• Relations r, s:

r:  A  B        s:  A  B
    α  1            α  2
    α  2            β  3
    β  1

• Symbolic form: $r - s$
• r MINUS s:

    A  B
    α  1
    β  1
Cartesian Product
• All attribute names must be different
• A Cartesian product is the set of all ordered pairs such that in each pair, the first element comes from the first set and the second element comes from the second set
• However, since the result of a relational operator is a relation, each pair is combined into a single tuple containing all the elements of both source tuples
• Uses the keyword TIMES
Example: Cartesian Product
• Relations r, s:

r:  A  B        s:  C  D   E
    α  1            α  10  a
    β  2            β  10  a
                    β  20  b
                    γ  10  b

• r TIMES s
• Symbolic form: $r \times s$

    A  B  C  D   E
    α  1  α  10  a
    α  1  β  10  a
    α  1  β  20  b
    α  1  γ  10  b
    β  2  α  10  a
    β  2  β  10  a
    β  2  β  20  b
    β  2  γ  10  b
Restrict
• Returns those tuples that satisfy a specified condition
• Yields a horizontal subset, a/k/a “SELECT”
a WHERE p
• p is called the restriction condition
• p is a predicate and returns a boolean
Select/Restrict Operation – Example
• Relation r:

    A  B  C   D
    α  α  1   7
    α  β  5   7
    β  β  12  3
    β  β  23  10

• r WHERE ((A=B) AND (D>5))
• Symbolic form: $\sigma_{A=B \land D>5}(r)$

    A  B  C   D
    α  α  1   7
    β  β  23  10
Project
• Returns specified columns
• Yields a vertical subset
• The general form is a commalist of attributes to be kept in the result
• If all attributes are kept, all tuples are kept; otherwise, duplicate tuples in the result are eliminated
• An alternative specification is to name the attributes to be excluded:
P { ALL BUT WEIGHT }
Project Operation – Example
• Relation r:

    A  B   C
    α  10  1
    α  20  1
    β  30  1
    β  40  2

• r {A, C}
• Symbolic form: $\pi_{A,C}(r)$

    A  C         A  C
    α  1         α  1
    α  1    =    β  1
    β  1         β  2
    β  2

(the duplicate tuple is eliminated)
Join – Natural Join
• When unqualified, join means “natural join”
• For any two relations r and s with at least one matching attribute, the join operator returns a relation with
  - a heading that is the union of the headings of r and s, and
  - a body consisting of all combined tuples of r and s such that the values of the common attributes are the same in the tuples of r and s
• Attributes that do not match from each source relation are retained
• If there is no common attribute, the result is a Cartesian product
• If all attributes are common, the result is an intersect
Natural Join Example
• Relations r, s:

r:  A  B  C  D        s:  B  D  E
    α  1  α  a            1  a  α
    β  2  γ  a            3  a  β
    γ  4  β  b            1  a  γ
    α  1  γ  a            2  b  δ
    δ  2  β  b            3  b  ε

• r NATURAL JOIN s
• Symbolic form: $r \bowtie s$

    A  B  C  D  E
    α  1  α  a  α
    α  1  α  a  γ
    α  1  γ  a  α
    α  1  γ  a  γ
    δ  2  β  b  δ
Join – Theta Join
• Used to join relations on a comparison between attributes, where the comparison operator theta need not be equality (e.g., <, >, ≠)
• Given relations a and b, and attributes X and Y, this can be expressed as follows:
(a TIMES b) WHERE X theta Y
$a \bowtie_{X\,\theta\,Y} b$
• When theta is =, the result can be made to be that of natural join (project away the duplicate attribute, and rename the kept one)
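The identity (a TIMES b) WHERE X theta Y can be sketched by composing a product with a restriction. The relations a and b below are invented sample data, and the function names `times` and `theta_join` are hypothetical.

```python
import operator
from itertools import product

def times(head_a, body_a, head_b, body_b):
    """Cartesian product; attribute names must be different."""
    if set(head_a) & set(head_b):
        raise ValueError("attribute names must differ; RENAME one side first")
    heading = tuple(head_a) + tuple(head_b)
    return heading, {ta + tb for ta, tb in product(body_a, body_b)}

def theta_join(head_a, body_a, head_b, body_b, x, theta, y):
    """(a TIMES b) WHERE x theta y, with theta a two-argument comparison."""
    heading, body = times(head_a, body_a, head_b, body_b)
    ix, iy = heading.index(x), heading.index(y)
    return heading, {t for t in body if theta(t[ix], t[iy])}

# Hypothetical sample relations a(X) and b(Y).
a = {(1,), (2,), (3,)}
b = {(2,), (3,)}

_, less_than = theta_join(("X",), a, ("Y",), b, "X", operator.lt, "Y")
_, equal = theta_join(("X",), a, ("Y",), b, "X", operator.eq, "Y")
print(sorted(less_than), sorted(equal))
```

With `operator.eq` the result still carries both X and Y; projecting one away and renaming the other would give the natural-join result, as the last bullet notes.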
Divide - Example
• Let S be a relation of suppliers, P one of parts, and SP the mediator
• S JOIN ( S {S#} DIVIDEDBY P {P#} PER SP {S#, P#} )
returns only those suppliers who supply all parts
Division - symbolic form
• Notation: $c \div b$
• c = (A1, …, Am, B1, …, Bn)
• b = (B1, …, Bn)
The result of $c \div b$ is a relation on schema
c – b = (A1, …, Am)
$c \div b = \{\, t \mid t \in \pi_{c-b}(c) \land \forall u \in b\ (tu \in c) \,\}$
where tu means the concatenation of tuples t and u to produce a single tuple
Division Operation – Example
• Relations r, s:

r:  A  B        s:  B
    α  1            1
    α  2            2
    α  3
    β  1
    γ  1
    δ  1
    δ  3
    δ  4
    ε  6
    ε  1
    β  2

• $r \div s$:

    A
    α
    β
Another Division Example
• Relations r, s:

r:  A  B  C  D  E        s:  D  E
    α  a  α  a  1            a  1
    α  a  γ  a  1            b  1
    α  a  γ  b  1
    β  a  γ  a  1
    β  a  γ  b  3
    γ  a  γ  a  1
    γ  a  γ  b  1
    γ  a  β  b  1

• $r \div s$:

    A  B  C
    α  a  γ
    γ  a  γ
Relational Algebra
• Basic operators
  - select: $\sigma$
  - project: $\pi$
  - union: $\cup$
  - set difference: $-$
  - Cartesian product: $\times$
  - rename: $\rho$
• All other operators (intersect, join, divide) can be specified in terms of the basic operators
Division Operation
• Property
  - Let $q = r \div s$
  - Then q is the largest relation satisfying $q \times s \subseteq r$
• Definition in terms of the basic algebra operations
Let r(R) and s(S) be relations, and let $S \subseteq R$:
$r \div s = \pi_{R-S}(r) - \pi_{R-S}\big((\pi_{R-S}(r) \times s) - \pi_{R-S,S}(r)\big)$
To see why:
  - $\pi_{R-S,S}(r)$ simply reorders the attributes of r
  - $\pi_{R-S}\big((\pi_{R-S}(r) \times s) - \pi_{R-S,S}(r)\big)$ gives those tuples t in $\pi_{R-S}(r)$ such that for some tuple $u \in s$, $tu \notin r$
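The definition of division in terms of project, product, and difference can be followed step by step in code. This is a sketch under the assumption that relations are (heading, set-of-tuples) pairs; the helper names `project` and `divide` are illustrative. The data is the first division example, with r(A, B) and s(B).

```python
from itertools import product

def project(heading, body, attrs):
    """Body of the projection of (heading, body) onto attrs, in that order."""
    idx = [heading.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in body}

def divide(head_r, body_r, head_s, body_s):
    """r / s, assuming every attribute of s is also an attribute of r."""
    rest = [a for a in head_r if a not in head_s]              # R - S
    candidates = project(head_r, body_r, rest)                 # pi_{R-S}(r)
    reordered = project(head_r, body_r, rest + list(head_s))   # pi_{R-S,S}(r)
    all_pairs = {t + u for t, u in product(candidates, body_s)}
    missing = all_pairs - reordered                            # pairs tu not in r
    disqualified = {tu[:len(rest)] for tu in missing}          # project onto R - S
    return tuple(rest), candidates - disqualified

head_r = ("A", "B")
r = {("α", 1), ("α", 2), ("α", 3), ("β", 1), ("γ", 1), ("δ", 1),
     ("δ", 3), ("δ", 4), ("ε", 6), ("ε", 1), ("β", 2)}
head_s = ("B",)
s = {(1,), (2,)}

heading, quotient = divide(head_r, r, head_s, s)
print(heading, sorted(quotient))
```

A candidate is removed as soon as one required pairing is missing from r, so only α and β, which appear with both B = 1 and B = 2, remain.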


Aggregate Functions and Operations
• An aggregation function takes a collection of values and returns a single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
• Aggregate operation in relational algebra:
${}_{G_1, G_2, \ldots, G_n}\, g_{\,F_1(A_1), F_2(A_2), \ldots, F_n(A_n)}(r)$
  - r is any relational-algebra expression
  - G1, G2, …, Gn is a list of attributes on which to group (can be empty)
  - Each Fi is an aggregate function
  - Each Ai is an attribute name
Aggregate Operation – Example
• Relation r:

    A  B  C
    α  α  7
    α  β  7
    β  β  3
    β  β  10

• $g_{\,sum(C)}(r)$:

    sum(C)
    27
Aggregate Operation – Example
• Relation account grouped by branch_name:

    branch_name  account_number  balance
    Perryridge   A-102           400
    Perryridge   A-201           900
    Brighton     A-217           750
    Brighton     A-215           750
    Redwood      A-222           700

• ${}_{branch\_name}\, g_{\,sum(balance)}(account)$:

    branch_name  sum(balance)
    Perryridge   1300
    Brighton     1500
    Redwood      700
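Both aggregate examples can be reproduced with a small grouping helper. The function name `sum_by` and the use of positional indexes for attributes are assumptions made for this sketch; only the sum aggregate is shown.

```python
from collections import defaultdict

ACCOUNT = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]

def sum_by(rows, group_index, value_index):
    """Group rows on one attribute and sum another within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_index]] += row[value_index]
    return dict(totals)

# branch_name g sum(balance) (account)
by_branch = sum_by(ACCOUNT, 0, 2)

# Empty grouping list: a single sum over the whole relation, g sum(C) (r).
r = [("α", "α", 7), ("α", "β", 7), ("β", "β", 3), ("β", "β", 10)]
total_c = sum(row[2] for row in r)

print(by_branch)   # {'Perryridge': 1300, 'Brighton': 1500, 'Redwood': 700}
print(total_c)     # 27
```

Note that the aggregate runs over the column values of every tuple: the two 7s in r come from distinct tuples and both count, whereas projecting onto C first would collapse them.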
What is the Algebra for?
• The purpose of the algebra is to allow the writing of relational expressions
• Applications of the algebra: retrieval, update, defining integrity constraints, derived relvars, stability and security
• An implemented language can be said to be relationally complete if it is at least as powerful as the algebra
Relational Calculus
• Based on predicate calculus, the relational calculus is a more natural-language-like alternative to the relational algebra
• The calculus specifies “what” instead of “how”
• Instead of operators used by the system to construct a result relation, the calculus offers a notation to express the result relation in terms of the source relations
• Relational calculus and algebra are logically equivalent
  - Every statement that can be written in calculus can also be written in algebra, and vice versa
Relational Calculus Implementations
• Codd proposed a language called ALPHA to implement the relational calculus
• QUEL, an early competitor to SQL, was based on ALPHA
• Relational calculus has two forms
  - Tuple calculus and domain calculus
  - Both are equivalent
• Tuple calculus has been implemented by QUEL
• Domain calculus has been implemented by “Query By Example” (QBE)
Tuple Calculus
• A tuple range variable is a variable that ranges over the tuples of a relation; i.e., at any time, the value of a range variable defined over a relation is some tuple of that relation
S(SX): range variable SX ranges over the tuples of relation S
Quantifiers - EXISTS
• EXISTS is the existential quantifier
• EXISTS V ( p ) -- there exists at least one value of V that makes p true
• Formally this is an iterated OR
FALSE OR p (t1) OR … OR p (tm), where t1, …, tm are the values of V
-- evaluates to FALSE if m = 0 (the set is empty)
Quantifiers - FORALL
• FORALL is the universal quantifier
• FORALL V ( p ) -- for all values of V, p is true
• Formally this is an iterated AND
TRUE AND p (t1) AND … AND p (tm)
-- evaluates to TRUE as long as every p (ti) is true
• This evaluates to TRUE when the set is empty (m = 0)
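The iterated OR and iterated AND readings can be written out literally, which also shows where the empty-set results come from. The function names `exists` and `forall` are illustrative; Python's built-in `any` and `all` behave the same way.

```python
def exists(values, p):
    """Iterated OR: FALSE OR p(t1) OR ... OR p(tm)."""
    result = False
    for v in values:
        result = result or p(v)
    return result

def forall(values, p):
    """Iterated AND: TRUE AND p(t1) AND ... AND p(tm)."""
    result = True
    for v in values:
        result = result and p(v)
    return result

def is_positive(x):
    return x > 0

print(exists([], is_positive), forall([], is_positive))          # empty range
print(exists([-1, 2], is_positive), forall([-1, 2], is_positive))
```

With no values to iterate over, each function simply returns its starting value, which is why EXISTS over an empty range is false and FORALL over an empty range is true.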
Free and Bound Variables
• Every reference to a range variable is either free or bound
• A reference to a range variable V that is free in predicate p is bound in the predicates EXISTS V (p) and FORALL V (p)
Well Formed Formulas (WFFs)
• R(RX)
• RX.a o SX.b, where o is a comparison operator such as <, >, or =
• RX.a o C, where C is a constant
• F1 AND F2
• F1 OR F2
• NOT F
• EXISTS RX (p)
• FORALL RX (p)
Calculus Expression
RX | WFF
• The only range-variable references that may be free on the right side of the vertical bar (in the WFF) are those specified on the left side of the vertical bar
• All other references must be bound by some quantifier
Relational Integrity Constraints
• Integrity means correctness
• Constraints are conditions that must hold on all valid relation instances
• Constraints should be checked before attempting any insert, update, or delete operation
• If any constraint is violated, the operation must be aborted; otherwise the database would be left in an inconsistent state
Types
• Special integrity constraints
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
• General integrity constraints
Key Constraint
• If the values of some attribute are unique, that attribute can be defined as a (candidate) key attribute; i.e., this attribute can be used as a (primary) key
Entity Integrity
• Relational database schema: A set S of relation schemas that belong to the same database. S is the name of the database.
S = {R1, R2, ..., Rn}
• Entity integrity: The primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.
$t[PK] \neq \text{null}$ for any tuple t in r(R)
• Note: Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
Referential Integrity Constraint
The database must not have any unmatched foreign key values.
The value in the foreign key column (or columns) FK of the referencing relation R1 can be either:
(1) a value of the corresponding primary key PK that exists in the referenced relation R2, or
(2) a null.
In case (2), the FK in R1 should not be a part of its own primary key.
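A referential integrity check amounts to looking for foreign key values with no matching primary key value. In the sketch below, the DEPARTMENT and EMPLOYEE relations, their attribute names, and the helper `unmatched_foreign_keys` are all hypothetical, and Python's None stands in for a null.

```python
# Hypothetical referenced relation R2 (primary key: dept_id).
DEPARTMENT = [
    {"dept_id": 10, "name": "Sales"},
    {"dept_id": 20, "name": "Research"},
]

# Hypothetical referencing relation R1 (foreign key: dept_id).
EMPLOYEE = [
    {"emp_id": 1, "dept_id": 10},     # case (1): matches an existing PK value
    {"emp_id": 2, "dept_id": None},   # case (2): null foreign key
    {"emp_id": 3, "dept_id": 30},     # unmatched: no department 30
]

def unmatched_foreign_keys(referencing, fk, referenced, pk):
    """Return the referencing rows whose non-null FK matches no PK value."""
    pk_values = {row[pk] for row in referenced}
    return [
        row for row in referencing
        if row[fk] is not None and row[fk] not in pk_values
    ]

violations = unmatched_foreign_keys(EMPLOYEE, "dept_id", DEPARTMENT, "dept_id")
print(violations)
```

The null in employee 2 is acceptable here because dept_id is not part of EMPLOYEE's own primary key (emp_id).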
General Integrity Constraints
• A general constraint is any user-specified constraint
• Types of constraints
  - Static constraints
    Type, attribute, relation, database (relation and database constraints inherit constraints from attributes and add more constraints in addition)
  - Transition constraints
Type constraints
• Type constraints check format and values immediately
• The definition of a domain (or type) is a type constraint
Attribute constraints
• Attribute constraints are inherited from those of the declared type
• I.e., the definition of an attribute is an attribute constraint
• E.g., status INT(5)
Relation constraints
• Inherit constraints from attributes and add business rule constraints in addition
• Involve a single relation
• Ex.: All suppliers in London must have status 20
• Ex.: Red parts must be stored in Paris
Database constraints
• Inherit constraints from attributes and add business rule constraints in addition
• Involve more than one relation
• Ex.: No supplier with status less than 20 can supply any part in a quantity greater than 500
• Ex.: All Paris suppliers must supply a red part in quantity less than 200
Transition Constraints
• The general constraints discussed so far are static in nature; i.e., they check only that a state is correct
• Sometimes the states are correct, but the transition between two correct states may be incorrect
• Ex.: The status of a supplier must not decrease
• Transition constraints constrain certain actions
• Transition constraints can apply to a database or a relation, but not to an attribute or a type
Transition Constraints
• Example
  - Correct marital statuses are M, NM, W, D
  - Correct transitions
    NM to M, M to W, M to D, W to M
  - Incorrect transitions
    NM to W, NM to D, W to D, D to W
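The marital-status example can be sketched as a check that runs on every update. The function name `check_update` is hypothetical. Two points are assumptions not stated in the slides: leaving the status unchanged is treated as allowed, and any transition that is not in the list of correct transitions (for example D to M) is treated as disallowed.

```python
STATES = {"M", "NM", "W", "D"}

# The correct transitions listed in the example, as (old, new) pairs.
ALLOWED = {("NM", "M"), ("M", "W"), ("M", "D"), ("W", "M")}

def check_update(old, new):
    """Return new if the update is acceptable, otherwise raise ValueError."""
    # Static constraint: the new state must itself be a correct status.
    if new not in STATES:
        raise ValueError(f"{new!r} is not a valid marital status")
    # Transition constraint: both states may be valid and the change still wrong.
    # Assumption: an unchanged status is allowed.
    if old != new and (old, new) not in ALLOWED:
        raise ValueError(f"transition {old} to {new} is not allowed")
    return new

print(check_update("NM", "M"))
try:
    check_update("NM", "W")
except ValueError as err:
    print("rejected:", err)
```

The static check looks only at the new value, while the transition check needs both the old and the new value, which is why it cannot be expressed as a type or attribute constraint.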
