Lectures 2014 Handout5 RelationalModel
• The strength of the relational approach to data management comes from the formal foundation
provided by the theory of relations.
• The model was first proposed by Dr. E. F. Codd of IBM in 1970 in the following paper: “A Relational
Model of Data for Large Shared Data Banks”, Communications of the ACM, June 1970.
INFORMAL DEFINITIONS
FORMAL DEFINITIONS
• Relation
– A Relation may be defined in multiple ways.
– The Schema of a Relation: R(A1, A2, ..., An)
– Relation schema R is defined over attributes A1, A2, ..., An
– Each attribute Ai is the name of a role played by some domain D in R
– Domain D is denoted by dom(Ai)
– The degree of a relation is the number of attributes of R
– CUSTOMER (CustID, CustName, Address, Phone)
– Here, CUSTOMER is a relation defined over the four attributes CustID, CustName, Address,
Phone, each of which has a domain or a set of valid values. For example, the domain of
CustID is the set of 6-digit numbers.
• Tuple
– A tuple is an ordered list of values
– Each value is derived from an appropriate domain.
– Each row in the CUSTOMER table may be referred to as a tuple in the table and would
consist of four values. <632895, "John Smith", "101 Main St. Atlanta, GA 30332",
"(404)894-2000"> is a tuple belonging to the CUSTOMER relation.
– A relation may be regarded as a set of tuples (rows).
– Columns in a table are also called attributes of the relation.
• Domain
– A domain has a logical definition:
∗ e.g. “USA phone numbers” are the set of 10 digit phone numbers valid in the U.S.
– A domain may have a data-type or a format defined for it.
Figure 18: The attributes and tuples of STUDENT relation. Here null represents values that are unknown
or inapplicable to certain tuples.
∗ The USA phone numbers may have a format: (ddd)ddd-dddd, where each d is a decimal
digit.
∗ E.g., dates have various formats, such as month name, date, year; or yyyy-mm-dd; or dd
mm, yyyy; etc.
• – The relation is formed over the cartesian product of the sets; each set has values from a domain;
that domain is used in a specific role which is conveyed by the attribute name.
– For example, attribute CustName is defined over the domain of strings of 25 characters. The
role these strings play in the CUSTOMER relation is that of the name of customers.
– Formally, given R(A1, A2, ..., An):
r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
– R: schema of the relation
– r of R: a specific “value” or population of R.
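The formal definition can be illustrated with a small Python sketch. The two-attribute schema and the tiny domains below are invented for illustration (a real domain such as the 6-digit numbers is far larger); the check only confirms that every tuple of one relation state r(R) is an element of dom(A1) × ... × dom(An).

```python
from itertools import product

# Schema R(A1, ..., An): an ordered sequence of attribute names.
# A cut-down CUSTOMER schema, assumed here to keep the product small.
CUSTOMER = ("CustID", "CustName")

# Hypothetical, deliberately tiny domains.
dom = {
    "CustID": {632895, 632896},
    "CustName": {"John Smith", "Mary Jones"},
}

# dom(A1) x dom(A2): every combination of values a tuple could take.
all_possible = set(product(*(dom[a] for a in CUSTOMER)))

# r(R): one relation state, i.e. a set of tuples.
r = {(632895, "John Smith"), (632896, "Mary Jones")}

degree = len(CUSTOMER)               # number of attributes of R
is_valid_state = r <= all_possible   # r(R) is a subset of the product
```

Because r is an ordinary Python set, it also reflects the fact that a relation state carries no duplicate tuples and no tuple ordering.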
DEFINITION SUMMARY
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       Extension
CHARACTERISTICS OF RELATIONS
• Ordering of tuples in a relation r(R) : The tuples are not considered to be ordered, even though
they appear to be in the tabular form.
• Ordering of attributes in a relation schema R (and of values within each tuple): We will consider
the attributes in R(A1, A2, ..., An) and the values in t = <v1, v2, ..., vn> to be ordered.
• Values in a tuple : All values are considered atomic (indivisible). A special null value is used to
represent values that are unknown or inapplicable to certain tuples.
Notation:
We refer to component values of a tuple t by t[Ai] = vi (the value of attribute Ai for tuple t).
Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw,
respectively.
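As a rough illustration of this notation, the sketch below looks up t[Ai] and t[Au, ..., Aw] by attribute name. The helper names component and subtuple are hypothetical; the schema and tuple are the CUSTOMER example used earlier in the handout.

```python
# Attribute order of the CUSTOMER schema.
CUSTOMER = ("CustID", "CustName", "Address", "Phone")

t = (632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404)894-2000")

def component(schema, t, attr):
    """Return t[attr], the value of one attribute in tuple t."""
    return t[schema.index(attr)]

def subtuple(schema, t, attrs):
    """Return t[Au, ..., Aw] as a tuple, in the order the attributes are listed."""
    return tuple(component(schema, t, a) for a in attrs)
```

For example, subtuple(CUSTOMER, t, ["CustID", "Phone"]) picks out the first and last values of t.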
Relational Integrity Constraints
• Constraints are conditions that must hold on all valid relation instances. There are three main
types of constraints:
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
Key Constraints
• Superkey of R : A set of attributes SK of R such that no two tuples in any valid relation instance
r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠
t2[SK].
• Key of R : A “minimal” superkey; that is, a superkey K such that removal of any attribute from
K results in a set of attributes that is not a superkey.
Example: The CAR relation schema CAR(State, Reg#, SerialNo, Make, Model, Year)
has two keys, Key1 = {State, Reg#} and Key2 = {SerialNo}, which are also superkeys. {SerialNo,
Make} is a superkey but not a key.
• If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary
key attributes are underlined.
Figure 19: The CAR relation, with two candidate keys: LicenseNumber and EngineSerialNumber
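The superkey and key definitions can be tested against a single relation instance, as in the sketch below. Note the limitation: a superkey must hold in every valid instance, so a check on one instance can only refute a candidate, never prove it. The sample rows and the function names is_superkey and is_key are made up for illustration.

```python
from itertools import combinations

CAR = ("State", "Reg#", "SerialNo", "Make", "Model", "Year")

# Made-up sample instance of CAR.
r = [
    ("GA", "ABC-123", "A69352", "Ford", "Mustang", 2002),
    ("FL", "ABC-123", "B43696", "Oldsmobile", "Cutlass", 2005),
    ("GA", "XYZ-789", "X83554", "Ford", "Mustang", 2002),
]

def is_superkey(schema, rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    idx = [schema.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

def is_key(schema, rows, attrs):
    """True if attrs is a superkey here and dropping any one attribute breaks that."""
    attrs = list(attrs)
    if not is_superkey(schema, rows, attrs):
        return False
    # Checking subsets one attribute smaller is enough: any smaller
    # superkey would make one of these subsets a superkey too.
    return not any(
        is_superkey(schema, rows, subset)
        for subset in combinations(attrs, len(attrs) - 1)
    )
```

On this instance, {State, Reg#} and {SerialNo} pass is_key, while {SerialNo, Make} passes is_superkey but fails is_key, matching the example above.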
Entity Integrity
• Relational Database Schema : A set S of relation schemas that belong to the same database. S is
the name of the database.
S = {R1, R2, ..., Rn}
• Entity Integrity : The primary key attributes PK of each relation schema R in S cannot have null
values in any tuple of r(R). This is because primary key values are used to identify the individual
tuples.
• Note: Other attributes of R may be similarly constrained to disallow null values, even though they
are not members of the primary key.
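A minimal sketch of an entity integrity check follows, assuming null is represented by Python's None. The STUDENT attributes, the sample rows, and the function name entity_integrity_violations are hypothetical.

```python
# Hypothetical schema and primary key; None stands in for null.
STUDENT = ("StudentID", "Name", "OfficePhone")
PRIMARY_KEY = ("StudentID",)

def entity_integrity_violations(schema, rows, pk):
    """Return the rows that have a null in any primary key attribute."""
    idx = [schema.index(a) for a in pk]
    return [row for row in rows if any(row[i] is None for i in idx)]

rows = [
    (1001, "Student A", None),        # null in a non-key attribute: allowed
    (None, "Student B", "555-0100"),  # null in the primary key: violation
]

bad = entity_integrity_violations(STUDENT, rows, PRIMARY_KEY)
```

Only the second row is reported, since a null in OfficePhone does not affect tuple identification.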
Referential Integrity
Figure 20: Referential integrity constraints displayed on the COMPANY relational database schema
• The value of the foreign key FK in a tuple of the referencing relation R1 can be either:
1. an existing value of the corresponding primary key PK in the referenced
relation R2, or
2. a null.
In case (2), the FK in R1 should not be a part of its own primary key.
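The two permitted cases can be checked mechanically. The sketch below assumes a single-attribute foreign key, None for null, and a made-up fragment with a referencing relation EMPLOYEE (R1) and a referenced relation DEPARTMENT (R2); the attribute names and rows are illustrative and not taken from the figures.

```python
# Referenced relation R2: (Dnumber, Dname), primary key at position 0.
DEPARTMENT = [(1, "Headquarters"), (5, "Research")]

# Referencing relation R1: (EmpID, Name, Dno), foreign key at position 2.
EMPLOYEE = [
    ("E1", "Employee A", 5),     # case 1: matches an existing Dnumber
    ("E2", "Employee B", None),  # case 2: null foreign key
    ("E3", "Employee C", 7),     # violation: there is no department 7
]

def fk_violations(r1_rows, fk_pos, r2_rows, pk_pos):
    """Return rows of R1 whose FK is neither null nor an existing PK value of R2."""
    pk_values = {row[pk_pos] for row in r2_rows}
    return [
        row for row in r1_rows
        if row[fk_pos] is not None and row[fk_pos] not in pk_values
    ]

dangling = fk_violations(EMPLOYEE, 2, DEPARTMENT, 0)
```

Here only the third EMPLOYEE row is flagged. A composite foreign key would need the same comparison on subtuples rather than single values.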
Figure 21: One possible database state for the COMPANY relational database schema
File: Lectures 2014.tex Date: Tuesday 4th March, 2014 10:13am Revision: 0.3