Module 2a-Relational Model

Uploaded by

harshithhs14
Module 2a

The Relational Data Model and Relational Database Constraints and Relational Algebra

Origins

British computer scientist E. F. (Ted) Codd of IBM first proposed the relational
data model in a seminal paper in 1970 ("A Relational Model of Data for Large
Shared Data Banks", Communications of the ACM, June 1970), which led to a
revolution in the field of database management. The model is based upon the
mathematical concepts of relations, set theory and predicate logic. Codd, who died
in 2003, went on to write extensively about the subject. As a result of his
achievements, Codd was awarded the highest honour in computer science, the ACM
Turing Award, in 1981.

The first commercial implementations of the relational model became available in
the early 1980s (e.g., SQL/DS by IBM and the Oracle DBMS) and, since then, the
model has been implemented in a large number of commercial systems. Currently
popular DBMSs based upon the relational model include DB2 (IBM), Oracle, SQL
Server (Microsoft), PostgreSQL and MySQL.

5.1 Relational Model Concepts

• The relational model of data is based on the concept of a relation
o The strength of the relational approach to data management comes
from the formal foundation provided by the theory of relations
• A Relation is a mathematical concept based on the ideas of sets
• Informally, a relation looks like a table of values.
• A relation typically contains a set of rows.
• The data elements in each row represent certain facts that correspond to a
real-world entity or relationship
o In the formal model, rows are called tuples
o Each column has a column header that gives an indication of the
meaning of the data items in that column
o In the formal model, the column header is called an attribute name
(or just attribute)


Example of a Relation:

Key of a Relation:

Each row has a value of a data item (or set of items) that uniquely identifies that row
in the table. This item (or set of items) is called the key.

In the STUDENT table, SSN is the key.

FORMAL DEFINITIONS:

• The Schema (or description) of a Relation:
o Denoted by R(A1, A2, ..., An)
o R is the name of the relation
o The attributes of the relation are A1, A2, ..., An
• Example:

CUSTOMER (Cust-id, Cust-name, Address, Phone#)

o CUSTOMER is the relation name
o Defined over the four attributes: Cust-id, Cust-name, Address, Phone#
o Each attribute has a domain, i.e., a set of valid values.
o For example, the domain of Cust-id is 6-digit numbers.
• A tuple is an ordered list of values (enclosed in angle brackets ‘< … >’)
• Each value is drawn from the domain of the corresponding attribute.
• A row in the CUSTOMER relation is a 4-tuple, since it consists of four
values, for example:
<632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
• A relation is a set of such tuples (rows)


• Domain: A (usually named) set/universe of atomic values, where by "atomic" we
mean simply that, from the point of view of the database, each value in the domain
is indivisible (i.e., cannot be broken down into component parts).

Examples of domains:

o USA_phone_number: string of digits of length ten
o SSN: string of digits of length nine
o Name: string of characters beginning with an upper case letter
o GPA: a real number between 0.0 and 4.0
o Sex: a member of the set { female, male }
o Dept_Code: a member of the set { CMPS, MATH, ENGL, PHYS,
PSYC, ... }

These are all logical descriptions of domains. For implementation purposes,
it is necessary to provide descriptions of domains in terms of concrete data
types (or formats) that are provided by the DBMS (such
as String, int, boolean), in a manner analogous to how programming
languages have intrinsic data types.

• Attribute: the name of the role played by some value (coming from some
domain) in the context of a relational schema. The domain of attribute A is
denoted dom(A).
• Tuple: A tuple is a mapping from attributes to values drawn from the respective
domains of those attributes. A tuple is intended to describe some entity (or
relationship between entities) in the miniworld.

As an example, a tuple for a PERSON entity might be

{ Name --> "Rumpelstiltskin", Sex --> Male, IQ --> 143 }

• Relation: A (named) set of tuples all of the same form (i.e., having the same set
of attributes). The term table is a loose synonym. (Some database purists would
argue that a table is "only" a physical manifestation of a relation.)
• Relational Schema: used for describing (the structure of) a relation. E.g., R(A1,
A2, ..., An) says that R is a relation with attributes A1, ... An. The degree of a
relation is the number of attributes it has, here n.

Example: STUDENT(Name, SSN, Address)

(See Figure 5.1 for an example of a STUDENT relation/table having several
tuples/rows.)


• The relation state is a subset of the Cartesian product of the domains of its
attributes
o each domain contains the set of all possible values the attribute can
take.
o Example: attribute Cust-name is defined over the domain of character
strings of maximum length 25
o dom(Cust-name) is varchar(25)
o The role these strings play in the CUSTOMER relation is that of the
name of a customer.

• Formally,
o Given R(A1, A2, ..., An)
o r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)

• R(A1, A2, …, An) is the schema of the relation

• R is the name of the relation

• A1, A2, …, An are the attributes of the relation

• r(R): a specific state (or "value" or “population”) of relation R – this is a set of
tuples (rows)

o r(R) = {t1, t2, …, tm} where each ti is an n-tuple
o ti = <v1, v2, …, vn> where each vj ∈ dom(Aj)

• Let R(A1, A2) be a relation schema:
o Let dom(A1) = {0,1}
o Let dom(A2) = {a,b,c}

• Then: dom(A1) × dom(A2) is the set of all possible combinations:
o {<0,a> , <0,b> , <0,c>, <1,a>, <1,b>, <1,c> }

• The relation state r(R) ⊆ dom(A1) × dom(A2)

• For example: r(R) could be {<0,a> , <0,b> , <1,c> }
o this is one possible state (or “population” or “extension”) r of the
relation R, defined over A1 and A2.
o It has three 2-tuples: <0,a> , <0,b> , <1,c>
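
The subset relationship above can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the names dom_A1, dom_A2, all_tuples and is_valid_state are made up for illustration. It builds the Cartesian product of the two example domains and tests whether a candidate state is a subset of it.

```python
from itertools import product

# Domains taken from the example above.
dom_A1 = {0, 1}
dom_A2 = {"a", "b", "c"}

# dom(A1) x dom(A2): every possible 2-tuple over the two domains.
all_tuples = set(product(dom_A1, dom_A2))

# One possible relation state r(R), as in the example.
r = {(0, "a"), (0, "b"), (1, "c")}

def is_valid_state(state, universe):
    """A state respects the domains if it is a subset of the Cartesian product."""
    return state <= universe

print(len(all_tuples))                          # 6 combinations
print(is_valid_state(r, all_tuples))            # True
print(is_valid_state({(2, "a")}, all_tuples))   # False: 2 is not in dom(A1)
```

Because a Python set cannot hold duplicates, the sketch also reflects the fact that a relation state never contains the same tuple twice.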


Definition Summary:

Informal Terms                  Formal Terms
Table                           Relation
Column Header                   Attribute
All possible Column Values      Domain
Row                             Tuple
Table Definition                Schema of a Relation
Populated Table                 State of the Relation

• Relational Database: A collection of relations, each one consistent with its
specified relational schema.

5.1.2 Characteristics of Relations

i) Ordering of Tuples: A relation is a set of tuples; hence, there is no order
associated with them. That is, it makes no sense to refer to, for example, the 5th tuple
in a relation. When a relation is depicted as a table, the tuples are necessarily listed
in some order, of course, but you should attach no significance to that order.
Similarly, when tuples are represented on a storage device, they must be organized
in some fashion, and it may be advantageous, from a performance standpoint, to
organize them in a way that depends upon their content.

ii) Ordering of Attributes:

• Under the definition given so far, the attributes in R(A1, A2, ..., An) and the
values in t = <v1, v2, ..., vn> are considered to be ordered.
• However, a more general alternative definition of relation does not require
this ordering.


If a tuple is viewed as a mapping from its attributes (i.e., the names we give to the
roles played by the values comprising the tuple) to the corresponding values, then
the order in which the attributes are listed in a table is irrelevant.
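
As a rough illustration of the two views (a sketch only, reusing the PERSON tuple from the earlier example; the variable names and the helper as_tuple are invented here), a Python tuple models the ordered definition and a dict models the mapping definition:

```python
# Ordered view: meaning depends on position, so reordering changes the tuple.
ordered_1 = ("Rumpelstiltskin", "Male", 143)
ordered_2 = (143, "Male", "Rumpelstiltskin")

# Mapping view: each value is attached to its attribute name.
mapped_1 = {"Name": "Rumpelstiltskin", "Sex": "Male", "IQ": 143}
mapped_2 = {"IQ": 143, "Sex": "Male", "Name": "Rumpelstiltskin"}

def as_tuple(mapping):
    """Freeze a mapping so it can be stored in a set (a relation state)."""
    return frozenset(mapping.items())

# Both mappings describe the same tuple, so the relation holds it only once.
relation = {as_tuple(mapped_1), as_tuple(mapped_2)}

print(ordered_1 == ordered_2)  # False: same values, different order
print(mapped_1 == mapped_2)    # True: attribute order is irrelevant
print(len(relation))           # 1
```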

iii) Values of Attributes:

For a relation to be in First Normal Form, each of its attribute domains must consist
of atomic (neither composite nor multi-valued) values. Each value in a tuple must
be from the domain of the attribute for that column.

If t = <v1, v2, …, vn> is a tuple (row) in the relation state r of R(A1,
A2, …, An), then each vi must be a value from dom(Ai).

A special null value is used to represent values that are unknown or inapplicable to
certain tuples.

iv) Interpretation of a Relation: Each relation can be viewed as a predicate and
each tuple in that relation can be viewed as an assertion for which that predicate is
satisfied (i.e., has value true) for the combination of values in it. In other words,
each tuple represents a fact. Example (see Figure 5.1): The first tuple listed means:
There exists a student having name Benjamin Bayer, having SSN 305-61-2435,
having age 19, etc.

Keep in mind that some relations represent facts about entities (e.g., students)
whereas others represent facts about relationships between entities (e.g., between
students and course sections).

The closed world assumption states that the only true facts about the miniworld are
those represented by whatever tuples currently populate the database.

5.1.3 Relational Model Notation: R(A1, A2, ..., An) is a relational schema of
degree n denoting that there is a relation R having as its attributes A1, A2, ..., An.

• By convention, Q, R, and S denote relation names.


• By convention, q, r, and s denote relation states. For example, r(R) denotes
one possible state of relation R. If R is understood from context, this could be
written, more simply, as r.
• By convention, t, u, and v denote tuples.
• The "dot notation" R.A (e.g., STUDENT.Name) is used to qualify an attribute
name, usually for the purpose of distinguishing it from a same-named attribute
in a different relation (e.g., DEPARTMENT.Name).


5.2 Relational Model Constraints and Relational Database Schemas


Constraints are conditions that must hold on all valid relation states.
Constraints on databases can be categorized as follows:
• inherent model-based: Example: no two tuples in a relation can be
duplicates (because a relation is a set of tuples)
• schema-based: can be expressed using DDL; this kind is the focus of this
section.
• application-based: are specific to the "business rules" of the miniworld and
are typically difficult or impossible to express and enforce within the data model.
Hence, their enforcement is left to application programs.
Elaborating upon schema-based constraints:
5.2.1 Domain Constraints: Each attribute value must be either null (which is really
a non-value) or drawn from the domain of that attribute. Note that some DBMSs
allow you to impose the not null constraint upon an attribute, which is to say that
the attribute may not have the (non-)value null.

5.2.2 Key Constraints: A relation is a set of tuples, and each tuple's "identity" is
given by the values of its attributes. Hence, it makes no sense for two tuples in a
relation to be identical (because then the two tuples are actually one and the same
tuple). That is, no two tuples may have the same combination of values in their
attributes.

Usually the miniworld dictates that there be (proper) subsets of attributes for which
no two tuples may have the same combination of values. Such a set of attributes is
called a superkey of its relation. From the fact that no two tuples can be identical, it
follows that the set of all attributes of a relation constitutes a superkey of that
relation.

• Superkey of R:
o Is a set of attributes SK of R with the following condition:
▪ No two tuples in any valid relation state r(R) will have the same
value for SK
▪ That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK]
▪ This condition must hold in any valid state r(R)

A key is a minimal superkey, i.e., a superkey such that, if we were to remove any of
its attributes, the resulting set of attributes fails to be a superkey.

Example: Suppose that we stipulate that a faculty member is uniquely identified
by Name and Address and also by Name and Department, but by no single one of
the three attributes mentioned. Then { Name, Address, Department } is a
(non-minimal) superkey and each of { Name, Address } and { Name, Department } is a
key (i.e., minimal superkey).

• Example: Consider the CAR relation schema:


o CAR(State, Reg#, SerialNo, Make, Model, Year)
o CAR has two keys:
▪ Key1 = {State, Reg#}
▪ Key2 = {SerialNo}
o Both are also superkeys of CAR
o {SerialNo, Make} is a superkey but not a key.
• In general:
o Any key is a superkey (but not vice versa)
o Any set of attributes that includes a key is a superkey
o A minimal superkey is also a key
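
The definitions of superkey and key can be tried out on a small CAR state. This is a sketch in Python, not part of the original notes: the three sample rows and the function names is_superkey and is_key are invented for illustration. Note the limitation: a superkey must hold in every valid state, so a single state can only refute a superkey, never prove one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of THIS state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """A key is a superkey that stops being one when any attribute is removed."""
    if not is_superkey(rows, attrs):
        return False
    smaller = combinations(attrs, len(attrs) - 1)
    return not any(is_superkey(rows, subset) for subset in smaller)

# Hypothetical CAR state (values are made up).
cars = [
    {"State": "TX", "Reg#": "ABC-739", "SerialNo": "A69352", "Make": "Ford"},
    {"State": "FL", "Reg#": "TVP-347", "SerialNo": "B43696", "Make": "Oldsmobile"},
    {"State": "TX", "Reg#": "TVP-347", "SerialNo": "X83554", "Make": "Ford"},
]

print(is_key(cars, ("State", "Reg#")))         # True
print(is_key(cars, ("SerialNo",)))             # True
print(is_superkey(cars, ("SerialNo", "Make"))) # True
print(is_key(cars, ("SerialNo", "Make")))      # False: Make can be removed
```

Checking only the subsets with one attribute removed is enough, because any superset of a superkey is itself a superkey.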

Candidate key: any key of the relation.

If a relation has several candidate keys, one is chosen (somewhat arbitrarily) to be
the primary key.

• The primary key attributes are underlined in the schema.
• Example: Consider the CAR relation schema:
o CAR(State, Reg#, SerialNo, Make, Model, Year)
o We choose SerialNo as the primary key
• The primary key value is used to uniquely identify each tuple in a relation
o Provides the tuple identity
• Also used to reference the tuple from another tuple
• General rule: Choose as primary key the smallest of the candidate keys (in
terms of size)
o Not always applicable – choice is sometimes subjective

Primary key: a key chosen to act as the means by which to identify tuples in a
relation. Typically, one prefers a primary key to be one having as few attributes as
possible.


5.2.3 Relational Databases and Relational Database Schemas

A relational database schema is a set of schemas for its relations (see Figure 5.5)
together with a set of integrity constraints.

• Relational Database Schema:


o A set S of relation schemas that belong to the same database.
o S is the name of the whole database schema
o S = {R1, R2, ..., Rn}
o R1, R2, …, Rn are the names of the individual relation schemas within
the database S

A relational database state/instance/snapshot is a set of states of its relations such
that no integrity constraint is violated. (See Figure 5.6 for a snapshot of
COMPANY.)


5.2.4 Entity Integrity, Referential Integrity, and Foreign Keys

Entity Integrity Constraint: In a tuple, none of the values of the attributes forming
the relation's primary key may have the (non-)value null.

• This is because primary key values are used to identify the individual tuples.
• t[PK] ≠ null for any tuple t in r(R)
• If PK has several attributes, null is not allowed in any of these attributes


Referential Integrity Constraint: (See Figure 5.7) A foreign key of relation R is a
set of its attributes intended to be used (by each tuple in R) for identifying/referring
to a tuple in some relation S. (R is called the referencing relation
and S the referenced relation.) For this to make sense, the set of attributes
of R forming the foreign key should "correspond to" some superkey of S. Indeed, by
definition we require this superkey to be the primary key of S.

This constraint says that, for every tuple in R, the tuple in S to which it refers must
actually be in S. Note that a foreign key may refer to a tuple in the same relation and
that a foreign key may be part of a primary key (indeed, for weak entity types, this
will always occur). A foreign key may have value null, in which case it does not
refer to any tuple in the referenced relation.

• Statement of the constraint
o The value in the foreign key column (or columns) FK of the
referencing relation R1 can be either:
▪ (1) an existing value of the corresponding
primary key PK in the referenced relation R2, or
▪ (2) null.
• In case (2), the FK in R1 should not be a part of its own primary key.


Semantic Integrity Constraints: application-specific restrictions that are unlikely
to be expressible in DDL. Examples:

• salary of a supervisee cannot be greater than that of her/his supervisor
• salary of an employee cannot be lowered
• the maximum number of hours per employee for all projects he or she works on is 56
hrs per week

A constraint specification language may have to be used to express these.

SQL-99 allows triggers and assertions to express some of these.

5.3 Update Operations and Dealing with Constraint Violations

• Basic operations for changing the database:
o INSERT a new tuple in a relation
o DELETE an existing tuple from a relation
o MODIFY (UPDATE) an attribute of an existing tuple

For each of the update operations (Insert, Delete, and Update), we consider what
kinds of constraint violations may result from applying it and how we might choose
to react.

5.3.1 Insert:

• domain constraint violation: some attribute value is not in the domain of its attribute
• entity integrity violation: primary key of new tuple is (wholly or partly) null
• key constraint violation: key of new tuple is same as that of an existing tuple
• referential integrity violation: foreign key of new tuple refers to non-existent
tuple

Ways of dealing with a violation: reject the attempt to insert, or give the user an
opportunity to try again with different attribute values.
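
The four kinds of insert violation can be observed with a real DBMS. Below is a minimal sketch using Python's built-in sqlite3 module; it is one possible illustration, not the notes' own method. The table and column names follow the COMPANY style, and all sample rows are invented. Three SQLite-specific assumptions are marked in comments: foreign keys must be switched on with a PRAGMA, a CHECK clause stands in for the domain of Ssn, and NOT NULL is declared explicitly on the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
conn.executescript("""
    CREATE TABLE DEPARTMENT (
        Dnumber INTEGER NOT NULL PRIMARY KEY,
        Dname   TEXT    NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        -- CHECK stands in for the domain "string of length nine";
        -- NOT NULL is spelled out to enforce entity integrity.
        Ssn  TEXT NOT NULL PRIMARY KEY CHECK (length(Ssn) = 9),
        Name TEXT NOT NULL,
        Dno  INTEGER REFERENCES DEPARTMENT(Dnumber)
    );
    INSERT INTO DEPARTMENT VALUES (5, 'Research');
    INSERT INTO EMPLOYEE VALUES ('123456789', 'John Smith', 5);
""")

def try_insert(ssn, name, dno):
    """Attempt an INSERT; return None if accepted, else the DBMS error text."""
    try:
        with conn:  # commit on success, roll back on failure
            conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (ssn, name, dno))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

attempts = {
    "domain":      ("12345",     "Short Ssn", 5),  # Ssn not in its domain
    "entity":      (None,        "No Key",    5),  # null primary key
    "key":         ("123456789", "Duplicate", 5),  # primary key already present
    "referential": ("987654321", "Bad Dept",  9),  # department 9 does not exist
    "valid":       ("333445555", "Franklin Wong", 5),
}
results = {label: try_insert(*row) for label, row in attempts.items()}
for label, error in results.items():
    print(f"{label:12s} -> {'accepted' if error is None else 'rejected: ' + error}")
```

Here the reaction to every violation is the first option named above: the insert is rejected and the database is left unchanged.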


5.3.2 Delete:

• referential integrity violation: a tuple referring to the deleted one exists.

Three options for dealing with it:

• Reject the deletion
• Attempt to cascade (or propagate) by deleting any referencing tuples (plus
those that reference them, etc., etc.)
• Modify the foreign key attribute values in referencing tuples to null or to
some valid value referencing a different tuple
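
In SQL these three options roughly correspond to the ON DELETE actions RESTRICT, CASCADE and SET NULL (SET DEFAULT covers the "some valid value" variant). The sketch below, written with Python's sqlite3 module, tries each action on the same made-up data; the helper name delete_department and the sample rows are assumptions for illustration only.

```python
import sqlite3

def delete_department(on_delete):
    """Build a tiny database whose foreign key uses the given ON DELETE action,
    delete department 5, and return the remaining EMPLOYEE rows or the error."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce the FK
    conn.executescript(f"""
        CREATE TABLE DEPARTMENT (Dnumber INTEGER NOT NULL PRIMARY KEY);
        CREATE TABLE EMPLOYEE (
            Ssn TEXT NOT NULL PRIMARY KEY,
            Dno INTEGER REFERENCES DEPARTMENT(Dnumber) ON DELETE {on_delete}
        );
        INSERT INTO DEPARTMENT VALUES (4), (5);
        INSERT INTO EMPLOYEE VALUES ('111111111', 5), ('222222222', 4);
    """)
    try:
        with conn:
            conn.execute("DELETE FROM DEPARTMENT WHERE Dnumber = 5")
        result = conn.execute("SELECT Ssn, Dno FROM EMPLOYEE ORDER BY Ssn").fetchall()
    except sqlite3.IntegrityError as exc:
        result = f"rejected: {exc}"
    conn.close()
    return result

for action in ("RESTRICT", "CASCADE", "SET NULL"):
    print(f"{action:8s} -> {delete_department(action)}")
```

With RESTRICT the deletion fails, with CASCADE the employee of department 5 disappears as well, and with SET NULL that employee stays but its Dno becomes null.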

5.3.3 Update:

• Key constraint violation: primary key is changed so as to become the same as
another tuple's
• referential integrity violation:
o foreign key is changed and new one refers to nonexistent tuple
o primary key is changed and now other tuples that had referred to this
one violate the constraint


5.3.4 Transactions: This concept is relevant in the context where multiple users
and/or application programs are accessing and updating the database concurrently.
A transaction is a logical unit of work that may involve several accesses and/or
updates to the database (such as what might be required to reserve several seats on
an airplane flight). The point is that, even though several transactions might be
processed concurrently, the end result must be as though the transactions were
carried out sequentially. (Example of simultaneous withdrawals from same checking
account.)
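
The "logical unit of work" idea can be sketched with Python's sqlite3 module. The example below is only an illustration under assumed names (the ACCOUNT table, its columns and the transfer function are invented); it shows that the updates of one transaction take effect together or not at all. It does not demonstrate concurrent execution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ACCOUNT (
        Acct_id INTEGER NOT NULL PRIMARY KEY,
        Balance INTEGER NOT NULL CHECK (Balance >= 0)
    );
    INSERT INTO ACCOUNT VALUES (1, 100), (2, 50);
""")

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction: both updates or neither."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance + ? WHERE Acct_id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance - ? WHERE Acct_id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as {Acct_id: Balance}."""
    return dict(conn.execute("SELECT Acct_id, Balance FROM ACCOUNT"))

ok = transfer(1, 2, 30)       # succeeds: 100 -> 70 and 50 -> 80
failed = transfer(1, 2, 500)  # debit would go negative, so the credit is undone too
print(ok, failed, balances())
```

In the failed transfer the credit to account 2 is executed first, yet it does not survive, because the whole transaction is rolled back when the debit violates the CHECK constraint.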


ER-to-Relational Mapping Algorithm


o Step 1: Mapping of Regular Entity Types
o Step 2: Mapping of Weak Entity Types
o Step 3: Mapping of Binary 1:1 Relationship Types
o Step 4: Mapping of Binary 1:N Relationship Types
o Step 5: Mapping of Binary M:N Relationship Types
o Step 6: Mapping of Multivalued Attributes
o Step 7: Mapping of N-ary Relationship Types

• Step 1: Mapping of Regular Entity Types


o For each regular (strong) entity type E in the ER schema, create a
relation R that includes all the simple attributes of E.
o Choose one of the key attributes of E as the primary key for R.
o If the chosen key of E is composite, the set of simple attributes that
form it will together form the primary key of R.
• Example: We create the relations EMPLOYEE, DEPARTMENT, and
PROJECT in the relational schema corresponding to the regular entities in the
ER diagram.
o SSN, DNUMBER, and PNUMBER are the primary keys for the
relations EMPLOYEE, DEPARTMENT, and PROJECT as shown.


• Step 2: Mapping of Weak Entity Types
o For each weak entity type W in the ER schema with owner entity type
E, create a relation R and include all simple attributes (or simple
components of composite attributes) of W as attributes of R.
o Also, include as foreign key attributes of R the primary key attribute(s)
of the relation(s) that correspond to the owner entity type(s).
o The primary key of R is the combination of the primary key(s) of the
owner(s) and the partial key of the weak entity type W, if any.
• Example: Create the relation DEPENDENT in this step to correspond to the
weak entity type DEPENDENT.
o Include the primary key SSN of the EMPLOYEE relation as a foreign
key attribute of DEPENDENT (renamed to ESSN).
o The primary key of the DEPENDENT relation is the combination
{ESSN, DEPENDENT_NAME} because DEPENDENT_NAME is
the partial key of DEPENDENT.
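
Steps 1 and 2 could be written down as table definitions roughly as follows. This is a sketch using Python's sqlite3 module; the non-key columns FNAME, LNAME and RELATIONSHIP, the sample rows and the helper primary_key are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Step 1: regular entity type EMPLOYEE; key attribute SSN becomes the primary key.
    CREATE TABLE EMPLOYEE (
        SSN   TEXT NOT NULL PRIMARY KEY,
        FNAME TEXT,
        LNAME TEXT
    );
    -- Step 2: weak entity type DEPENDENT; owner key plus partial key form the primary key.
    CREATE TABLE DEPENDENT (
        ESSN           TEXT NOT NULL REFERENCES EMPLOYEE(SSN),
        DEPENDENT_NAME TEXT NOT NULL,
        RELATIONSHIP   TEXT,
        PRIMARY KEY (ESSN, DEPENDENT_NAME)
    );
    INSERT INTO EMPLOYEE  VALUES ('123456789', 'John', 'Smith');
    INSERT INTO DEPENDENT VALUES ('123456789', 'Alice', 'Daughter');
""")

def primary_key(table):
    """Return the primary key column names of a table, in key order."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default, pk); pk is 0 for non-key columns.
    return [name for _, name, _, _, _, pk in sorted(info, key=lambda r: r[5]) if pk > 0]

print(primary_key("EMPLOYEE"))   # ['SSN']
print(primary_key("DEPENDENT"))  # ['ESSN', 'DEPENDENT_NAME']
```

Because ESSN is both a foreign key and part of the primary key, a DEPENDENT row cannot exist without its owner EMPLOYEE row.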

• Step 3: Mapping of Binary 1:1 Relationship Types
o For each binary 1:1 relationship type R in the ER schema, identify the
relations S and T that correspond to the entity types participating in R.
• There are three possible approaches:
o Foreign key approach: Choose one of the relations, say S, and include
as a foreign key in S the primary key of T. It is better to choose an entity
type with total participation in R in the role of S.
▪ Example: The 1:1 relationship type MANAGES is mapped by choosing the
participating entity type DEPARTMENT to serve in the role of
S, because its participation in the MANAGES relationship type
is total.
o Merged relation option: An alternative mapping of a 1:1 relationship type
is possible by merging the two entity types and the relationship into a
single relation. This may be appropriate when both participations are
total.
o Cross-reference or relationship relation option: The third alternative is
to set up a third relation R for the purpose of cross-referencing the
primary keys of the two relations S and T representing the entity types.

• Step 4: Mapping of Binary 1:N Relationship Types
o For each regular binary 1:N relationship type R, identify the relation S
that represents the participating entity type at the N-side of the
relationship type.
o Include as foreign key in S the primary key of the relation T that
represents the other entity type participating in R.
o Include any simple attributes of the 1:N relationship type as attributes of S.
• Example: 1:N relationship types WORKS_FOR, CONTROLS, and
SUPERVISION in the figure.
o For WORKS_FOR we include the primary key DNUMBER of the
DEPARTMENT relation as foreign key in the EMPLOYEE relation
and call it DNO.
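
A possible rendering of Step 4 as table definitions is sketched below with Python's sqlite3 module. DNO comes from the example above; the column SUPERSSN for the SUPERVISION relationship, the column DNAME and the helper foreign_keys are assumed names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE DEPARTMENT (
        DNUMBER INTEGER NOT NULL PRIMARY KEY,
        DNAME   TEXT
    );
    -- EMPLOYEE is on the N-side of WORKS_FOR and of SUPERVISION,
    -- so it receives both foreign keys.
    CREATE TABLE EMPLOYEE (
        SSN      TEXT NOT NULL PRIMARY KEY,
        DNO      INTEGER REFERENCES DEPARTMENT(DNUMBER),  -- WORKS_FOR
        SUPERSSN TEXT REFERENCES EMPLOYEE(SSN)            -- SUPERVISION (assumed name)
    );
    INSERT INTO DEPARTMENT VALUES (5, 'Research');
    INSERT INTO EMPLOYEE VALUES ('333445555', 5, NULL);
    INSERT INTO EMPLOYEE VALUES ('123456789', 5, '333445555');
""")

def foreign_keys(table):
    """Return sorted (column, referenced table, referenced column) triples."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Each row is (id, seq, table, from, to, on_update, on_delete, match).
    return sorted((frm, ref_table, to) for _, _, ref_table, frm, to, *_ in rows)

print(foreign_keys("EMPLOYEE"))
```

Many EMPLOYEE rows may carry the same DNO value, while each row carries at most one, which is exactly the 1:N cardinality. SUPERSSN also shows a foreign key that refers to a tuple in the same relation.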


• Step 5: Mapping of Binary M:N Relationship Types


o For each regular binary M:N relationship type R, create a new relation
S to represent R.
o Include as foreign key attributes in S the primary keys of the relations
that represent the participating entity types; their combination will form
the primary key of S.
o Also include any simple attributes of the M:N relationship type (or
simple components of composite attributes) as attributes of S.
• Example: The M:N relationship type WORKS_ON from the ER diagram is
mapped by creating a relation WORKS_ON in the relational database
schema.
o The primary keys of the PROJECT and EMPLOYEE relations are
included as foreign keys in WORKS_ON and renamed PNO and
ESSN, respectively.
o Attribute HOURS in WORKS_ON represents the HOURS attribute of
the relationship type. The primary key of the WORKS_ON relation is the
combination of the foreign key attributes {ESSN, PNO}.
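
The WORKS_ON mapping might look as follows in a sketch written with Python's sqlite3 module. The sample rows and the helper functions projects_of and employees_on are invented for illustration, and the EMPLOYEE and PROJECT tables are cut down to their keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE EMPLOYEE (SSN TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE PROJECT  (PNUMBER INTEGER NOT NULL PRIMARY KEY);
    -- The M:N relationship type becomes its own relation.
    CREATE TABLE WORKS_ON (
        ESSN  TEXT    NOT NULL REFERENCES EMPLOYEE(SSN),
        PNO   INTEGER NOT NULL REFERENCES PROJECT(PNUMBER),
        HOURS REAL,
        PRIMARY KEY (ESSN, PNO)
    );
    INSERT INTO EMPLOYEE VALUES ('123456789'), ('333445555');
    INSERT INTO PROJECT  VALUES (1), (2);
    INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5),
                                ('123456789', 2, 7.5),
                                ('333445555', 2, 10.0);
""")

def projects_of(ssn):
    """Project numbers an employee works on (one employee, many projects)."""
    rows = conn.execute("SELECT PNO FROM WORKS_ON WHERE ESSN = ? ORDER BY PNO", (ssn,))
    return [pno for (pno,) in rows]

def employees_on(pno):
    """Employees working on a project (one project, many employees)."""
    rows = conn.execute("SELECT ESSN FROM WORKS_ON WHERE PNO = ? ORDER BY ESSN", (pno,))
    return [ssn for (ssn,) in rows]

print(projects_of("123456789"))  # [1, 2]
print(employees_on(2))           # ['123456789', '333445555']
```

The composite primary key {ESSN, PNO} lets either value repeat on its own but rejects a second row for the same employee and project pair.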

• Step 6: Mapping of Multivalued attributes


o For each multivalued attribute A, create a new relation R.
o This relation R will include an attribute corresponding to A, plus the
primary key attribute K (as a foreign key in R) of the relation that
represents the entity type or relationship type that has A as an attribute.
o The primary key of R is the combination of A and K. If the multivalued
attribute is composite, we include its simple components.
• Example: The relation DEPT_LOCATIONS is created.
o The attribute DLOCATION represents the multivalued attribute
LOCATIONS of DEPARTMENT, while DNUMBER (as foreign key)
represents the primary key of the DEPARTMENT relation.
o The primary key of R is the combination {DNUMBER,
DLOCATION}.
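
A short sketch of the DEPT_LOCATIONS mapping, again using Python's sqlite3 module. The location values, the DNAME column and the helper function locations are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE DEPARTMENT (
        DNUMBER INTEGER NOT NULL PRIMARY KEY,
        DNAME   TEXT
    );
    -- One row per (department, location) pair replaces the multivalued attribute.
    CREATE TABLE DEPT_LOCATIONS (
        DNUMBER   INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER),
        DLOCATION TEXT    NOT NULL,
        PRIMARY KEY (DNUMBER, DLOCATION)
    );
    INSERT INTO DEPARTMENT VALUES (5, 'Research');
    INSERT INTO DEPT_LOCATIONS VALUES (5, 'Bellaire'), (5, 'Houston'), (5, 'Sugarland');
""")

def locations(dnumber):
    """Reassemble the multivalued attribute LOCATIONS for one department."""
    rows = conn.execute(
        "SELECT DLOCATION FROM DEPT_LOCATIONS WHERE DNUMBER = ? ORDER BY DLOCATION",
        (dnumber,))
    return [loc for (loc,) in rows]

print(locations(5))  # ['Bellaire', 'Houston', 'Sugarland']
```

Every value stored in DEPT_LOCATIONS is atomic, so the mapping keeps the schema in First Normal Form while still recording several locations per department.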

• Step 7: Mapping of N-ary Relationship Types
o For each n-ary relationship type R, where n>2, create a new
relation S to represent R.
o Include as foreign key attributes in S the primary keys of the relations
that represent the participating entity types.
o Also include any simple attributes of the n-ary relationship type (or
simple components of composite attributes) as attributes of S.
• Example: The relationship type SUPPLY in the ER diagram on the next slide.
o This can be mapped to the relation SUPPLY shown in the relational
schema, whose primary key is the combination of the three foreign
keys {SNAME, PARTNO, PROJNAME}


Correspondence between ER and Relational Models

ER Model                        Relational Model
Entity type                     “Entity” relation
1:1 or 1:N relationship type    Foreign key (or “relationship” relation)
M:N relationship type           “Relationship” relation and two foreign keys
n-ary relationship type         “Relationship” relation and n foreign keys
Simple attribute                Attribute
Composite attribute             Set of simple component attributes
Multivalued attribute           Relation and foreign key
Value set                       Domain
Key attribute                   Primary (or secondary) key


Exercises:
1.

2.


3.

4. Convert the above ER diagram into a relational schema.


5. Consider the following relations for a database that keeps track of student enrollment in
courses and the books adopted for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw an ER diagram for this schema.
