Module 2a-Relational Model
British-born computer scientist E. F. (Ted) Codd of IBM, a mathematician by training,
first proposed the relational data model in a seminal 1970 paper (A Relational Model
of Data for Large Shared Data Banks, Communications of the ACM, June 1970), which
led to a revolution in the field of database management. The model is based upon the
mathematical concepts of relations, set theory and predicate logic. Codd, who died
in 2003, went on to write extensively about the subject. In recognition of his
achievements, Codd received the highest honour in computer science, the ACM Turing
Award, in 1981.
Example of a Relation:
Key of a Relation:
Each row has a value of a data item (or set of items) that uniquely identifies that row
in the table. This item (or set of items) is called the key.
FORMAL DEFINITIONS:
Examples of domains:
• Attribute: the name of the role played by some value (coming from some
domain) in the context of a relational schema. The domain of attribute A is
denoted dom(A).
• Tuple: A tuple is a mapping from attributes to values drawn from the respective
domains of those attributes. A tuple is intended to describe some entity (or
relationship between entities) in the miniworld.
• Relation: A (named) set of tuples all of the same form (i.e., having the same set
of attributes). The term table is a loose synonym. (Some database purists would
argue that a table is "only" a physical manifestation of a relation.)
• Relational Schema: describes the structure of a relation. E.g., R(A1,
A2, ..., An) says that R is a relation with attributes A1, ..., An. The degree of a
relation is the number of attributes it has, here n.
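The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CUSTOMER attribute names and sample values are hypothetical, and a frozenset of (attribute, value) pairs is just one convenient way to model a tuple as a mapping that can be stored in a set.

```python
# Hypothetical schema CUSTOMER(Cust_id, Cust_name, City); names are illustrative only.
schema = ("Cust_id", "Cust_name", "City")


def make_tuple(**values):
    """Build a tuple as a mapping from attribute names to values."""
    if set(values) != set(schema):
        raise ValueError("a tuple must supply exactly the attributes of the schema")
    # A frozenset of (attribute, value) pairs is hashable, so it can be a set member.
    return frozenset(values.items())


# A relation is a set of tuples that all have the same attributes.
customer = {
    make_tuple(Cust_id=1, Cust_name="Ada", City="London"),
    make_tuple(Cust_id=2, Cust_name="Ted", City="San Jose"),
    make_tuple(Cust_id=1, Cust_name="Ada", City="London"),  # identical tuple, absorbed by the set
}

degree = len(schema)          # number of attributes
tuple_count = len(customer)   # number of distinct tuples in this state
```

Because the relation is a set, the repeated tuple does not appear twice: the state holds two tuples, and the degree of the schema is three.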
• The relation state is a subset of the Cartesian product of the domains of its
attributes
o each domain contains the set of all possible values the attribute can
take.
o Example: attribute Cust-name is defined over the domain of character
strings of maximum length 25
o dom(Cust-name) is varchar(25)
o The role these strings play in the CUSTOMER relation is that of the
name of a customer.
• Formally,
o Given R(A1, A2, ..., An)
o r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
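As a small illustration of the formal statement, the sketch below enumerates the Cartesian product of two tiny domains and checks that a relation state is a subset of it. The domains and values are made up for the example; real domains such as varchar(25) are far too large to enumerate, so this only works as a toy.

```python
from itertools import product

# Two tiny, hypothetical domains (chosen only so the product can be listed).
dom_grade = {"A", "B", "C"}
dom_year = {1, 2}

# dom(Grade) x dom(Year): every combination that could possibly appear.
all_possible = set(product(dom_grade, dom_year))

# A relation state is any subset of that product.
state = {("A", 1), ("C", 2)}
is_valid_state = state <= all_possible

# A value outside its domain makes the state invalid.
bad_state = {("A", 3)}
is_bad_state_valid = bad_state <= all_possible
```

Here the product has 3 × 2 = 6 elements, `state` is a valid relation state, and `bad_state` is not, because 3 is not in the domain of the second attribute.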
Definition Summary:
• We consider the attributes in R(A1, A2, ..., An) and the values in t = <v1, v2,
..., vn> to be ordered.
• However, a more general alternative definition of relation does not require
this ordering.
If a tuple is viewed as a mapping from its attributes (i.e., the names we give to the
roles played by the values comprising the tuple) to the corresponding values, then
the order in which the attributes are listed in a table is irrelevant.
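A quick sketch of this point, using Python dictionaries as the mapping (the attribute names and values are hypothetical): two tuples that list their attributes in different orders are equal as mappings, but not as ordered lists of values.

```python
# The same tuple written with its attributes in two different orders.
t1 = {"Cust_id": 1, "Cust_name": "Ada"}
t2 = {"Cust_name": "Ada", "Cust_id": 1}

# Viewed as mappings from attributes to values, the two are identical.
same_as_mapping = t1 == t2

# Viewed as ordered sequences of values, they differ.
same_as_sequence = tuple(t1.values()) == tuple(t2.values())
```

`same_as_mapping` is true and `same_as_sequence` is false, which is why the ordered definition needs a fixed attribute order and the mapping definition does not.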
For a relation to be in First Normal Form, each of its attribute domains must consist
of atomic (neither composite nor multi-valued) values. Each value in a tuple must
come from the domain of the attribute for that column: if t = <v1, v2, …, vn> is a
tuple (row) in the relation state r of R(A1, A2, …, An), then each vi must be a
value from dom(Ai).
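The domain requirement can be checked mechanically. The sketch below is a minimal illustration: it reuses the earlier example in which dom(Cust-name) is the set of character strings of maximum length 25, while the Cust_id attribute and the helper names are assumptions made for the example. Null values are not handled here.

```python
def is_varchar25(value):
    """Membership test for the domain of strings with at most 25 characters."""
    return isinstance(value, str) and len(value) <= 25


def is_integer(value):
    """Membership test for a hypothetical integer domain."""
    return isinstance(value, int)


# Each attribute is paired with a membership test for its domain.
domains = {"Cust_id": is_integer, "Cust_name": is_varchar25}


def conforms(t, domains):
    """True if t has exactly the schema's attributes and each value is in its domain."""
    return set(t) == set(domains) and all(domains[a](v) for a, v in t.items())


ok = conforms({"Cust_id": 1, "Cust_name": "Ada"}, domains)
too_long = conforms({"Cust_id": 2, "Cust_name": "x" * 26}, domains)
# A list of names is multi-valued rather than atomic, so it fails the domain test.
multi_valued = conforms({"Cust_id": 3, "Cust_name": ["Ada", "Ted"]}, domains)
```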
A special null value is used to represent values that are unknown or inapplicable to
certain tuples.
Keep in mind that some relations represent facts about entities (e.g., students),
whereas others represent facts about relationships between entities (e.g., between
students and course sections).
The closed world assumption states that the only true facts about the miniworld are
those represented by whatever tuples currently populate the database.
5.1.3 Relational Model Notation: R(A1, A2, ..., An) is a relational schema of
degree n denoting that there is a relation R having as its attributes A1, A2, ..., An.
5.2.2 Key Constraints: A relation is a set of tuples, and each tuple's "identity" is
given by the values of its attributes. Hence, it makes no sense for two tuples in a
relation to be identical (because then the two tuples are actually one and the same
tuple). That is, no two tuples may have the same combination of values in their
attributes.
Usually the miniworld dictates that there be (proper) subsets of attributes for which
no two tuples may have the same combination of values. Such a set of attributes is
called a superkey of its relation. From the fact that no two tuples can be identical, it
follows that the set of all attributes of a relation constitutes a superkey of that
relation.
• Superkey of R:
o Is a set of attributes SK of R with the following condition:
▪ No two tuples in any valid relation state r(R) will have the same
value for SK
▪ That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK]
▪ This condition must hold in any valid state r(R)
A key is a minimal superkey, i.e., a superkey such that, if we were to remove any of
its attributes, the resulting set of attributes fails to be a superkey.
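The two definitions can be tried out on a sample state, with one important caveat: superkeys and keys are properties of the schema (they must hold in every valid state), so a check against a single state can only refute a candidate, never prove it. The function names and the sample STUDENT data below are hypothetical.

```python
def is_superkey(attrs, state):
    """True if no two tuples of this state agree on all attributes in attrs."""
    seen = set()
    for t in state:
        projection = tuple(t[a] for a in attrs)  # t[SK]
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_key(attrs, state):
    """True if attrs is a superkey of this state and no attribute can be dropped."""
    attrs = tuple(attrs)
    if not is_superkey(attrs, state):
        return False
    for dropped in attrs:
        smaller = tuple(a for a in attrs if a != dropped)
        if is_superkey(smaller, state):
            return False  # still unique without `dropped`, so attrs is not minimal
    return True


# Hypothetical sample state of a STUDENT relation.
students = [
    {"SSN": "111", "Name": "Ann", "Major": "CS"},
    {"SSN": "222", "Name": "Bob", "Major": "CS"},
    {"SSN": "333", "Name": "Ann", "Major": "Math"},
]

ssn_is_superkey = is_superkey(("SSN",), students)
name_is_superkey = is_superkey(("Name",), students)
ssn_name_is_superkey = is_superkey(("SSN", "Name"), students)
ssn_name_is_key = is_key(("SSN", "Name"), students)
name_major_is_key = is_key(("Name", "Major"), students)
```

In this state {SSN, Name} is a superkey but not a key, because SSN alone already identifies each tuple. The state also happens to make {Name, Major} look like a key, which shows why the miniworld, not a sample of data, has to decide what the keys are.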
If a relation has several candidate keys, one is chosen arbitrarily to be the primary
key.
Primary key: a key chosen to act as the means by which to identify tuples in a
relation. Typically, one prefers a primary key to be one having as few attributes as
possible.
A relational database schema is a set of schemas for its relations (see Figure 5.5)
together with a set of integrity constraints.
Entity Integrity Constraint: In a tuple, none of the values of the attributes forming
the relation's primary key may have the (non-)value null.
• This is because primary key values are used to identify the individual tuples.
• t[PK] ≠ null for any tuple t in r(R)
• If PK has several attributes, null is not allowed in any of these attributes
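A minimal sketch of the entity integrity check, using Python's None to stand in for null. The ENROLL tuples and the choice of (SSN, Course#, Quarter) as primary key are assumptions made for illustration.

```python
def violates_entity_integrity(t, primary_key):
    """True if any attribute of the primary key is null (None) in tuple t."""
    return any(t.get(attr) is None for attr in primary_key)


# Hypothetical primary key and tuples for an ENROLL relation.
enroll_pk = ("SSN", "Course#", "Quarter")

complete = {"SSN": "111", "Course#": "CS101", "Quarter": "W24", "Grade": None}
missing_part = {"SSN": "111", "Course#": None, "Quarter": "W24", "Grade": "A"}

complete_violates = violates_entity_integrity(complete, enroll_pk)
missing_part_violates = violates_entity_integrity(missing_part, enroll_pk)
```

The first tuple is acceptable even though Grade is null, since Grade is not part of the primary key. The second is rejected because one component of a multi-attribute key is null.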
Referential Integrity Constraint: This constraint says that, for every tuple in a
referencing relation R, the tuple in the referenced relation S to which it refers
(through a foreign key) must actually be in S. Note that a foreign key may refer to a
tuple in the same relation and that a foreign key may be part of a primary key
(indeed, for weak entity types, this will always occur). A foreign key may have the
value null, in which case it does not refer to any tuple in the referenced relation.
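The following sketch checks this constraint for a pair of relations held as lists of dictionaries. The relation and attribute names are hypothetical, None stands in for null, and a foreign key with any null component is treated as referring to nothing, which is a simplification.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return the tuples of `referencing` whose foreign key matches no tuple of `referenced`."""
    existing = {tuple(s[a] for a in pk) for s in referenced}
    dangling = []
    for t in referencing:
        value = tuple(t[a] for a in fk)
        if any(v is None for v in value):
            continue  # a null foreign key does not refer to any tuple
        if value not in existing:
            dangling.append(t)
    return dangling


# Hypothetical DEPARTMENT (referenced) and EMPLOYEE (referencing) states.
departments = [{"Dno": 4}, {"Dno": 5}]
employees = [
    {"SSN": "111", "Dno": 5},     # refers to an existing department
    {"SSN": "222", "Dno": None},  # refers to nothing, which is allowed
    {"SSN": "333", "Dno": 9},     # refers to a department that does not exist
]

violations = dangling_references(employees, ("Dno",), departments, ("Dno",))
```

Only the employee with Dno 9 is reported. The same function works for a foreign key that refers to the same relation: pass that relation as both arguments.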
For each of the update operations (Insert, Delete, and Update), we consider what
kinds of constraint violations may result from applying it and how we might choose
to react.
5.3.1 Insert:
Ways of dealing with a constraint violation: reject the attempted insertion, or give
the user the opportunity to try again with different attribute values.
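To see a DBMS reject violating insertions, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table and column names are hypothetical. Two SQLite-specific details are assumptions worth noting: foreign key checking must be switched on with a PRAGMA, and the primary key column is declared NOT NULL explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE employee (
           ssn  TEXT PRIMARY KEY NOT NULL,
           name TEXT NOT NULL,
           dno  INTEGER REFERENCES department(dno))"""
)
conn.execute("INSERT INTO department VALUES (5, 'Research')")
conn.execute("INSERT INTO employee VALUES ('111', 'Ann', 5)")
conn.commit()


def try_insert(row):
    """Attempt an insertion; report whether the DBMS accepted or rejected it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert(("111", "Bob", 5)),    # key constraint: ssn 111 already exists
    try_insert((None, "Cy", 5)),      # entity integrity: null primary key
    try_insert(("333", "Di", 9)),     # referential integrity: no department 9
    try_insert(("444", "Ed", None)),  # allowed: a null foreign key refers to nothing
]
```

The first three insertions are rejected and the last is accepted. An application that wants to give the user another chance could catch the rejection and prompt for different values.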
5.3.2 Delete:
5.3.3 Update:
5.3.4 Transactions: This concept is relevant in the context where multiple users
and/or application programs are accessing and updating the database concurrently.
A transaction is a logical unit of work that may involve several accesses and/or
updates to the database (such as what might be required to reserve several seats on
an airplane flight). The point is that, even though several transactions might be
processed concurrently, the end result must be as though the transactions were
carried out sequentially. (Example: two simultaneous withdrawals from the same
checking account.)
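The checking-account example can be imitated with two threads. This is only a toy sketch: the account, the amounts, and the use of a single lock are assumptions, and a real DBMS uses more refined concurrency control. The lock makes each withdrawal (check the balance, then update it) run as one indivisible unit, so the outcome matches some sequential order of the two withdrawals.

```python
import threading

account = {"balance": 100}   # hypothetical checking account
lock = threading.Lock()
outcomes = []


def withdraw(amount):
    """Check and update the balance as one unit of work."""
    with lock:  # without the lock, both threads could see 100 and both withdraw
        if account["balance"] >= amount:
            account["balance"] -= amount
            outcomes.append("ok")
        else:
            outcomes.append("insufficient funds")


# Two "simultaneous" withdrawals of 80 from the same account.
threads = [threading.Thread(target=withdraw, args=(80,)) for _ in range(2)]
for th in threads:
    th.start()
for th in threads:
    th.join()

final_balance = account["balance"]
```

Whichever thread runs first, exactly one withdrawal succeeds and the final balance is 20, just as if the two requests had been handled one after the other.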
Exercises:
2.
3.
4.
5.
6.
Exercises:
1.
2.
3.
4.
5. Consider the following relations for a database that keeps track of student enrollment in
courses and the books adopted for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw an ER diagram for this schema.