CH 2
Pratibha Joshi
Database System Concepts - 6th Edition 2.1 ©Silberschatz, Korth and Sudarshan
Relational Model
The relational model represents data in the form of relations (tables).
A row is called a tuple, a column header is called an attribute, and the
table is called a relation.
The set of allowed values for each attribute is called the domain of the
attribute.
Example: Names: the set of character strings that represent
names of persons.
Usa_phone_numbers: the set of ten-digit phone numbers valid in
the United States.
Attribute values are (normally) required to be atomic; that is,
indivisible
The special value null is a member of every domain
The null value causes complications in the definition of many
operations
Example of a Relation
[Figure: a sample relation drawn as a table, with its attributes (columns) and tuples (rows) labeled]
Relation Schema and Instance
Relational Schema: A schema represents the structure of a relation.
e.g., the relational schema of the STUDENT relation can be
represented as: STUDENT (STUD_NO, STUD_NAME,
STUD_PHONE, STUD_STATE, STUD_COUNTRY,
STUD_AGE)
Relational Instance: The set of tuples present in a relation at a
particular instant of time.
Degree: the number of attributes in a relation.
Cardinality: the total number of tuples in a relation.
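As a rough illustration of schema, instance, degree and cardinality, the sketch below models the STUDENT schema as a Python named tuple and one instance as a set of tuples. The sample rows are invented for the example and do not come from the slides.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """Schema of the STUDENT relation: one field per attribute."""
    stud_no: int
    stud_name: str
    stud_phone: str
    stud_state: str
    stud_country: str
    stud_age: int


# One relation instance: the tuples present at a particular moment.
# These rows are hypothetical sample data.
student_instance = {
    Student(1, "Asha", "5550100", "Punjab", "India", 20),
    Student(2, "Ravi", "5550101", "Kerala", "India", 19),
    Student(3, "Meena", "5550102", "Goa", "India", 21),
}

degree = len(Student._fields)        # number of attributes
cardinality = len(student_instance)  # number of tuples

print(degree, cardinality)
```

The degree is fixed by the schema, while the cardinality changes whenever tuples are inserted or deleted.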
Codd Rules
Codd Rules (contd.)
Also, set operations such as union, intersection and minus should be
supported.
8. Physical data independence: Any change to the physical storage or
location of a table should not require changes at the application level.
9. Logical data independence: Any change to the logical or conceptual
schema of a table should not require changes at the application level. For
example, merging two tables into one should not affect the applications
that access them. This is difficult to achieve in practice.
10. Integrity independence: Integrity constraints modified at the database
level should not require changes at the application level.
11. Distribution independence: The distribution of data over various
locations should not be visible to end users.
12. Non-subversion rule: Low-level access to data should not be able to
bypass integrity rules to change data.
Constraints
Every relation has some conditions that must hold for it to be a valid
relation. These conditions are called Relational Integrity
Constraints. There are three main integrity constraints −
• Key constraints
• Domain constraints
• Referential integrity constraints
These constraints are checked before any operation (insertion,
deletion or update) is performed on the database. If any constraint
is violated, the operation fails.
Domain constraints: Within each tuple, the value of each attribute
must be an atomic value from the domain of that attribute; that is, an
attribute can only take values that lie inside its domain.
Constraints (contd.)
e.g., consider a relation STUDENT with attributes ROLL_NO, NAME,
PHONE and AGE. If the constraint AGE > 0 is applied to the STUDENT
relation, inserting a negative value for AGE will fail.
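The AGE > 0 example can be tried with Python's built-in sqlite3 module. This is only a sketch: the table layout, column types and sample rows are assumptions made for the illustration, and the domain constraint is expressed as a SQL CHECK clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER,
        name    TEXT,
        phone   TEXT,
        age     INTEGER CHECK (age > 0)  -- domain constraint on AGE
    )
    """
)

# A valid tuple is accepted.
conn.execute("INSERT INTO student VALUES (1, 'Asha', '5550100', 19)")

# A tuple with a negative age violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', '5550101', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```

Only the valid tuple remains in the relation after the failed insert.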
Key Integrity: There must be at least one minimal subset of
attributes in the relation that can identify a tuple uniquely.
This minimal subset of attributes is called a key for that
relation.
e.g., ROLL_NO in STUDENT is a key. No two students can have the
same roll number. So a key has two properties:
1. It should be unique for all tuples.
2. It can’t have NULL values.
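The two key properties can be observed with sqlite3, as a sketch under assumed names. ROLL_NO is declared as a text column with PRIMARY KEY and NOT NULL; the explicit NOT NULL is an assumption added here because SQLite, unlike most systems, does not automatically forbid NULL in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no TEXT NOT NULL PRIMARY KEY, name TEXT)"
)


def try_insert(roll_no, name):
    """Return True if the tuple was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("R1", "Asha")      # accepted
duplicate = try_insert("R1", "Ravi")  # rejected: key value already exists
null_key = try_insert(None, "Meena")  # rejected: key value is NULL

print(first, duplicate, null_key)
```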
Referential Integrity: When an attribute of a relation can only take
values from another attribute of the same relation or of another relation,
the restriction is called referential integrity. Let us suppose we have two relations.
ANOMALIES
An anomaly is an irregularity, or something which deviates from the
expected or normal state. When designing databases, we identify
three types of anomalies: Insert, Update and Delete.
Insert anomaly:
The Insert operation provides a list of attribute values for a new tuple
t that is to be inserted into a relation R. Insert can violate any of the
four types of constraints.
Domain constraints can be violated if an attribute value is given that
does not appear in the corresponding domain or is not of the
appropriate data type.
Key constraints can be violated if a key value in the new tuple t
already exists in another tuple in the relation r(R).
Insert Anomaly
Entity integrity can be violated if any part of the primary key of the
new tuple t is NULL.
Referential integrity can be violated if the value of any foreign key in
t refers to a tuple that does not exist in the referenced relation.
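A referential integrity violation on insert can be reproduced with sqlite3. The BRANCH and STUDENT tables below are hypothetical, and foreign key checking has to be switched on explicitly in SQLite with PRAGMA foreign_keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Referenced relation (hypothetical).
conn.execute(
    "CREATE TABLE branch (branch_code TEXT NOT NULL PRIMARY KEY, branch_name TEXT)"
)
# Referencing relation: branch_code is a foreign key to branch.
conn.execute(
    """
    CREATE TABLE student (
        roll_no     INTEGER PRIMARY KEY,
        name        TEXT,
        branch_code TEXT REFERENCES branch(branch_code)
    )
    """
)

conn.execute("INSERT INTO branch VALUES ('CS', 'Computer Science')")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'CS')")  # 'CS' exists

try:
    # 'ME' does not exist in branch, so this foreign key value is dangling.
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'ME')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```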
Update Anomaly
If a tuple of the referenced relation is updated and the referenced attribute
value is used by the referencing attribute in the referencing relation, the
update of the referenced tuple is not allowed by default.
Updating an attribute that is neither part of a primary key nor of a foreign key
usually causes no problems. If a foreign key attribute is modified, the
DBMS must make sure that the new value refers to an existing tuple in the
referenced relation.
The violation can be handled in three ways:
First method: simultaneously update those tuples of the
referencing relation whose referencing attribute uses the
referenced attribute value being updated. This method of handling the
violation is called On Update Cascade.
Second method: reject (abort) the request to update the
referenced relation if the value is used by the
referencing relation.
Third method: set the referencing attribute to NULL or some other
(default) value in those tuples of the referencing relation that use
the value being updated.
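The first method, On Update Cascade, might look like this in sqlite3. The table and column names are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE branch (branch_code TEXT NOT NULL PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE student (
        roll_no     INTEGER PRIMARY KEY,
        branch_code TEXT REFERENCES branch(branch_code) ON UPDATE CASCADE
    )
    """
)
conn.execute("INSERT INTO branch VALUES ('CS')")
conn.execute("INSERT INTO student VALUES (1, 'CS')")

# Updating the referenced value propagates to the referencing tuples.
conn.execute("UPDATE branch SET branch_code = 'CSE' WHERE branch_code = 'CS'")

student_branch = conn.execute(
    "SELECT branch_code FROM student WHERE roll_no = 1"
).fetchone()[0]
print(student_branch)
```

Without the ON UPDATE CASCADE clause, the same UPDATE would be rejected, which corresponds to the second method.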
Delete Anomaly
The Delete operation can violate only referential integrity. This
occurs if the tuple being deleted is referenced by foreign keys from
other tuples in the database. To specify deletion, a condition on the
attributes of the relation selects the tuple (or tuples) to be deleted.
The violation can be handled in three ways:
First method: simultaneously delete those tuples of the
referencing relation whose referencing attribute uses the value of the
referenced attribute being deleted. This method of handling the
violation is called On Delete Cascade.
Second method: reject (abort) the request to delete
from the referenced relation if the value is used by the
referencing relation.
Third method: set the referencing attribute to NULL or some other
(default) value in those tuples of the referencing relation that use
the value being deleted.
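The three ways of handling a delete can be compared side by side in sqlite3. In this sketch the helper name delete_referenced and the two tables are hypothetical; the SQL actions CASCADE, RESTRICT and SET NULL are used to stand for the first, second and third methods.

```python
import sqlite3


def delete_referenced(action):
    """Delete a referenced branch tuple under the given ON DELETE action.

    Returns "rejected" if the DBMS refuses the delete, otherwise the
    contents of the referencing relation after the delete.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE branch (branch_code TEXT NOT NULL PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, "
        f"branch_code TEXT REFERENCES branch(branch_code) ON DELETE {action})"
    )
    conn.execute("INSERT INTO branch VALUES ('CS')")
    conn.execute("INSERT INTO student VALUES (1, 'CS')")
    try:
        conn.execute("DELETE FROM branch WHERE branch_code = 'CS'")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT roll_no, branch_code FROM student").fetchall()


cascade = delete_referenced("CASCADE")    # referencing tuple is deleted too
restrict = delete_referenced("RESTRICT")  # delete request is rejected
set_null = delete_referenced("SET NULL")  # foreign key becomes NULL

print(cascade, restrict, set_null)
```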
Database
A database consists of multiple relations
Information about an enterprise is broken up into parts
instructor
student
advisor
Bad design:
univ (instructor-ID, name, dept_name, salary, student_Id, ..)
results in
repetition of information (e.g., two students have the same instructor)
the need for null values (e.g., to represent a student with no advisor)
Normalization theory deals with how to design “good” relational schemas
Keys in Relational Model
Super Key
A super key is a set of one or more attributes that can uniquely
identify each record within a table.
Every candidate key is a super key, and every super key contains a
candidate key. In the Student table, super keys include
student_id, (student_id, name), phone, etc.
student_id is unique for every row of data, hence it can be used to identify
each row uniquely.
Next, consider (student_id, name): the names of two students can be the same,
but their student_id values can't, hence this combination is also a super key.
Similarly, the phone number of every student is assumed to be unique, hence
phone is also a super key.
So they are all super keys.
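The uniqueness test behind a super key can be sketched in Python. The function is_superkey and the sample rows are hypothetical. Note that a key is a property of the schema: checking one instance can show that a set of attributes is not a super key, but it cannot prove that it always will be one.

```python
def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs (for this instance)."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False  # two rows share the same values, so not unique
        seen.add(key)
    return True


# Hypothetical Student instance; two students share the name "Asha".
students = [
    {"student_id": 1, "name": "Asha", "phone": "5550100", "age": 19},
    {"student_id": 2, "name": "Ravi", "phone": "5550101", "age": 20},
    {"student_id": 3, "name": "Asha", "phone": "5550102", "age": 19},
]

print(is_superkey(students, ["student_id"]))          # unique per row
print(is_superkey(students, ["student_id", "name"]))  # still unique
print(is_superkey(students, ["phone"]))               # unique per row
print(is_superkey(students, ["name"]))                # names repeat
```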
Candidate Key
A candidate key is a minimal set of fields that can
uniquely identify each record in a table.
It is an attribute or a set of attributes that can act as the primary key
of a table.
There can be more than one candidate key.
In our example, student_id and phone are both candidate keys for
the Student table.
A candidate key that is not chosen as the primary key may be allowed to
contain NULL values.
A table can have multiple candidate keys but only one primary key
(the primary key cannot have a NULL value, so a candidate key
with NULL values can’t be the primary key).
A candidate key can be a combination of more than one
column (attribute).
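Minimality is what separates a candidate key from a super key. The sketch below enumerates attribute sets from smallest to largest and keeps those that are unique and contain no smaller key. The function names and sample rows are hypothetical, and the result only reflects this one instance.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if the projection of rows onto attrs has no repeated value."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows, attributes):
    """Return the minimal super keys of this instance, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


students = [
    {"student_id": 1, "name": "Asha", "phone": "5550100", "age": 19},
    {"student_id": 2, "name": "Ravi", "phone": "5550101", "age": 20},
    {"student_id": 3, "name": "Asha", "phone": "5550102", "age": 19},
]

keys = candidate_keys(students, ["student_id", "name", "phone", "age"])
print(keys)
```

Here (student_id, name) is a super key but not a candidate key, because student_id alone is already enough.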
Primary Key
The primary key is the candidate key that is most appropriate to serve as the
main key of a table.
It uniquely identifies each record in the table.
Its values are unique; it has no duplicate values.
It cannot be NULL.
A primary key need not be a single column; a combination of more than
one column can also be the primary key of a table.
For the Student table, student_id can be the primary key.
Foreign Key
If an attribute can only take the values which are present as values of
some other attribute, it will be a foreign key to the attribute to which it
refers.
The relation which is being referenced is called referenced relation and
the corresponding attribute is called referenced attribute.
The relation which refers to the referenced relation is called referencing
relation and the corresponding attribute is called referencing attribute.
The referenced attribute of the referenced relation should be the
primary key of that relation.
A foreign key is thus an attribute that acts as the primary key in one
table and as a secondary key in another table.
It links two or more relations (tables).
Foreign keys act as cross-references between the tables.
Secondary or Alternative key
The candidate keys that are not selected as the primary key are known
as secondary keys or alternate keys.
In the Student relation, phone is an alternate key.
All candidate keys other than the primary key are alternate keys.
The term secondary key is also used more loosely for a field such as age
that is used to look up records without being unique.
Such a field can match two or more records, because its values are repeated.
Relational Query Languages
Procedural vs. non-procedural (declarative) languages
“Pure” languages:
Relational algebra
Tuple relational calculus
Domain relational calculus
Relational operators
Schema Diagram for University Database
Relational Query Languages
Relational algebra is the basic set of operations for the relational
model
These operations enable a user to specify basic retrieval
requests (or queries)
The result of an operation is a new relation, which may have
been formed from one or more input relations
This property makes the algebra “closed” (all objects in
relational algebra are relations)
Relational Algebra Overview
Relational Algebra consists of several groups of operations
Unary Relational Operations
SELECT (symbol: σ (sigma))
PROJECT (symbol: π (pi))
RENAME (symbol: ρ (rho))
Relational Algebra Operations From Set Theory
UNION (∪), INTERSECTION (∩), DIFFERENCE (or
MINUS, –)
CARTESIAN PRODUCT (×)
Binary Relational Operations
JOIN (several variations of JOIN exist)
DIVISION
Additional Relational Operations
OUTER JOINS, OUTER UNION
AGGREGATE FUNCTIONS (these compute summaries of
information: for example, SUM, COUNT, AVG, MIN, MAX)
Unary Relational Operations: SELECT
The SELECT operation (denoted by σ (sigma)) is used to select a
subset of the tuples from a relation based on a selection
condition.
The selection condition acts as a filter
Keeps only those tuples that satisfy the qualifying
condition
Tuples satisfying the condition are selected whereas the
other tuples are discarded (filtered out)
Examples:
Select the EMPLOYEE tuples whose department number is 4:
σ DNO = 4 (EMPLOYEE)
Select the EMPLOYEE tuples whose salary is greater than
$30,000:
σ SALARY > 30,000 (EMPLOYEE)
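The two SELECT examples can be mimicked in Python by treating a relation as a list of dictionaries and the selection condition as a predicate function. This is a sketch only: the select helper and the EMPLOYEE rows are made up for the illustration.

```python
def select(relation, predicate):
    """sigma: keep only the tuples that satisfy the selection condition."""
    return [row for row in relation if predicate(row)]


# Hypothetical EMPLOYEE instance.
employee = [
    {"FNAME": "John", "LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"FNAME": "Franklin", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "DNO": 4, "SALARY": 25000},
]

# sigma DNO = 4 (EMPLOYEE)
dept4 = select(employee, lambda t: t["DNO"] == 4)

# sigma SALARY > 30,000 (EMPLOYEE)
high_paid = select(employee, lambda t: t["SALARY"] > 30000)

print([t["LNAME"] for t in dept4])
print([t["LNAME"] for t in high_paid])
```

Every result row keeps all four attributes, so the result has the same schema as EMPLOYEE.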
SELECT Operation Properties
The SELECT operation σ <selection condition> (R) produces a
relation S that has the same schema (same attributes) as R
SELECT is commutative:
σ <condition1> (σ <condition2> (R)) = σ <condition2> (σ <condition1> (R))
Because of the commutativity property, a cascade (sequence)
of SELECT operations may be applied in any order:
σ <cond1> (σ <cond2> (σ <cond3> (R))) = σ <cond2> (σ <cond3> (σ <cond1> (R)))
A cascade of SELECT operations may be replaced by a
single selection with a conjunction of all the conditions:
σ <cond1> (σ <cond2> (σ <cond3> (R))) = σ <cond1> AND <cond2> AND <cond3> (R)
The number of tuples in the result of a SELECT is less than
(or equal to) the number of tuples in the input relation R
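These properties can be checked on a small example. The sketch below uses a hypothetical select helper and invented EMPLOYEE rows; it demonstrates the identities on one instance and is not a proof.

```python
def select(relation, predicate):
    """sigma: keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Hypothetical EMPLOYEE instance.
employee = [
    {"NAME": "A", "DNO": 4, "SALARY": 25000},
    {"NAME": "B", "DNO": 4, "SALARY": 43000},
    {"NAME": "C", "DNO": 5, "SALARY": 40000},
    {"NAME": "D", "DNO": 5, "SALARY": 30000},
]


def in_dept4(t):
    return t["DNO"] == 4


def well_paid(t):
    return t["SALARY"] > 30000


# Commutativity: the order of two selections does not matter.
one_order = select(select(employee, in_dept4), well_paid)
other_order = select(select(employee, well_paid), in_dept4)

# Cascade: two selections equal one selection with the conditions ANDed.
conjunction = select(employee, lambda t: in_dept4(t) and well_paid(t))

print(one_order == other_order == conjunction)
print(len(conjunction) <= len(employee))
```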
Selection of tuples
Relation r
Unary Relational Operations: PROJECT
The PROJECT operation is denoted by π (pi)
This operation keeps certain columns (attributes) from a relation and
discards the other columns.
PROJECT creates a vertical partitioning
The list of specified columns (attributes) is kept in each tuple
The other attributes in each tuple are discarded
Example: To list each employee’s first and last name and salary, the
following is used:
π LNAME, FNAME, SALARY (EMPLOYEE)
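A minimal Python sketch of PROJECT follows, with a hypothetical project helper and invented EMPLOYEE rows. Because the result of a relational algebra operation is a set of tuples, the helper also drops duplicate rows that appear once columns are discarded.

```python
def project(relation, attributes):
    """pi: keep only the listed attributes and remove duplicate tuples."""
    result = []
    seen = set()
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append({a: row[a] for a in attributes})
    return result


# Hypothetical EMPLOYEE instance.
employee = [
    {"FNAME": "John", "LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"FNAME": "Franklin", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
]

# pi LNAME, FNAME, SALARY (EMPLOYEE)
names_and_salaries = project(employee, ["LNAME", "FNAME", "SALARY"])

# pi DNO (EMPLOYEE): the two tuples with DNO = 5 collapse into one.
departments = project(employee, ["DNO"])

print(names_and_salaries)
print(departments)
```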
Selection of Columns (Attributes)
Relation r:
Select A and C
Projection
π A, C (r)
Single expression versus sequence of
relational operations (Example)
To retrieve the first name, last name, and salary of
all employees who work in department number 5, we
must apply a select and a project operation
We can write a single relational algebra expression
as follows:
π FNAME, LNAME, SALARY (σ DNO=5 (EMPLOYEE))
OR we can explicitly show the sequence of
operations, giving a name to each intermediate
relation:
DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)
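Both forms of the query can be written out in Python. The select and project helpers and the EMPLOYEE rows are assumptions made for this sketch; the point is only that the nested expression and the named intermediate steps give the same relation.

```python
def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """pi: keep the listed attributes and remove duplicate tuples."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical EMPLOYEE instance.
employee = [
    {"FNAME": "John", "LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"FNAME": "Franklin", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
]

# Single expression: project applied directly to the result of select.
single = project(
    select(employee, lambda t: t["DNO"] == 5), ["FNAME", "LNAME", "SALARY"]
)

# Sequence of operations with a named intermediate relation.
dep5_emps = select(employee, lambda t: t["DNO"] == 5)
result = project(dep5_emps, ["FNAME", "LNAME", "SALARY"])

print(result)
```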
Unary Relational Operations: RENAME
The RENAME operator is denoted by ρ (rho)
In some cases, we may want to rename the attributes of a relation, or
the relation name, or both
Relational Algebra Operations from
Set Theory: UNION
UNION Operation
Binary operation, denoted by ∪
The result of R ∪ S is a relation that includes all tuples that are
either in R or in S or in both R and S
Duplicate tuples are eliminated
The two operand relations R and S must be
“type compatible” (or UNION compatible)
R and S must have same number of attributes
Each pair of corresponding attributes must be
type compatible (have same or compatible
domains)
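UNION with a compatibility check might be sketched as follows. A relation is represented here as a pair (attribute names, set of tuples), which is an assumption of this example. Only the number of attributes is checked; a fuller version would also compare the domains of corresponding attributes.

```python
def union(r, s):
    """R union S for relations given as (attribute_names, set_of_tuples)."""
    r_attrs, r_rows = r
    s_attrs, s_rows = s
    if len(r_attrs) != len(s_attrs):
        raise ValueError("not union compatible: different number of attributes")
    # Set union removes duplicate tuples automatically.
    return (r_attrs, r_rows | s_rows)


# Hypothetical relations r and s over attributes A and B.
r = (("A", "B"), {("a", 1), ("a", 2), ("b", 1)})
s = (("A", "B"), {("a", 2), ("b", 3)})

attrs, rows = union(r, s)
print(attrs, sorted(rows))

# A relation with a single attribute is not union compatible with r.
try:
    union(r, (("A",), {("a",)}))
    raised = False
except ValueError:
    raised = True
print(raised)
```

The tuple ("a", 2) occurs in both operands but appears only once in the result.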
Union of two relations
Relations r, s:
r ∪ s:
Relational Algebra Operations from Set
Theory: INTERSECTION
INTERSECTION is denoted by ∩
The result of the operation R ∩ S is a
relation that includes all tuples that are in
both R and S
The attribute names in the result will be the
same as the attribute names in R
The two operand relations R and S must
be “type compatible”
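INTERSECTION can be sketched the same way, again assuming a relation is a pair (attribute names, set of tuples). The second relation deliberately uses different attribute names to show that the result takes its names from R.

```python
def intersection(r, s):
    """R intersect S for relations given as (attribute_names, set_of_tuples)."""
    r_attrs, r_rows = r
    s_attrs, s_rows = s
    if len(r_attrs) != len(s_attrs):
        raise ValueError("not type compatible: different number of attributes")
    # The result keeps the attribute names of the first operand, R.
    return (r_attrs, r_rows & s_rows)


# Hypothetical relations; s names its attributes C and D instead of A and B.
r = (("A", "B"), {("a", 1), ("a", 2), ("b", 1)})
s = (("C", "D"), {("a", 2), ("b", 3)})

attrs, rows = intersection(r, s)
print(attrs, rows)
```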
Set Intersection of two relations
Relation r, s:
r ∩ s:
Relational Algebra Operations from Set
Theory: SET DIFFERENCE (cont.)
Set difference of two relations
Relations r, s:
r – s:
Relational Algebra Operations from Set
Theory: CARTESIAN PRODUCT
CARTESIAN (or CROSS) PRODUCT Operation
This operation is used to combine tuples from two
relations in a combinatorial fashion.
Denoted by R(A1, A2, . . ., An) × S(B1, B2, . . ., Bm)
Result is a relation Q with degree n + m:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
The resulting relation state has one tuple for each
combination of tuples, one from R and one from S.
Hence, if R has nR tuples (denoted as |R| = nR), and S
has nS tuples, then R × S will have nR * nS tuples.
The two operands do NOT have to be “type
compatible”
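The degree and size of a Cartesian product can be verified with a short sketch. Relations are again assumed to be pairs (attribute names, set of tuples), and the sample data is invented. The helper does not handle the case where R and S share an attribute name.

```python
from itertools import product


def cartesian_product(r, s):
    """R x S: every tuple of R concatenated with every tuple of S."""
    r_attrs, r_rows = r
    s_attrs, s_rows = s
    return (r_attrs + s_attrs, {a + b for a, b in product(r_rows, s_rows)})


# Hypothetical relations: R has degree 2 and 2 tuples, S has degree 3 and 3 tuples.
r = (("A", "B"), {("a", 1), ("b", 2)})
s = (("C", "D", "E"), {("a", 10, "x"), ("b", 10, "x"), ("b", 20, "y")})

q_attrs, q_rows = cartesian_product(r, s)

print(len(q_attrs))  # degree n + m
print(len(q_rows))   # nR * nS tuples
```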
Joining two relations – Cartesian Product
Relations r, s:
r x s:
Binary Relational Operations: JOIN
JOIN Operation (denoted by ⋈)
The sequence of CARTESIAN PRODUCT followed by
SELECT is used quite commonly to identify and select
related tuples from two relations
A special operation, called JOIN, combines this
sequence into a single operation
This operation is very important for any relational
database with more than a single relation, because it
allows us to combine related tuples from various
relations
The general form of a join operation on two relations
R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈ <join condition> S
where R and S can be any relations that result from
general relational algebra expressions.
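The equivalence between JOIN and a Cartesian product followed by SELECT can be illustrated in Python. The helper names, the EMPLOYEE and DEPARTMENT rows, and the join condition DNO = DNUMBER are assumptions made for this sketch; the attribute names of the two relations are assumed to be disjoint.

```python
def cartesian_product(r, s):
    """Combine every tuple of r with every tuple of s."""
    return [{**a, **b} for a in r for b in s]


def select(relation, predicate):
    """Keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def join(r, s, condition):
    """Join: combine tuples of r and s, keeping pairs that satisfy the condition."""
    return [{**a, **b} for a in r for b in s if condition({**a, **b})]


# Hypothetical relations.
employee = [
    {"LNAME": "Smith", "DNO": 5},
    {"LNAME": "Wallace", "DNO": 4},
]
department = [
    {"DNUMBER": 5, "DNAME": "Research"},
    {"DNUMBER": 4, "DNAME": "Administration"},
    {"DNUMBER": 1, "DNAME": "Headquarters"},
]


def same_department(t):
    return t["DNO"] == t["DNUMBER"]


joined = join(employee, department, same_department)
via_product = select(cartesian_product(employee, department), same_department)

print(joined == via_product)
print([(t["LNAME"], t["DNAME"]) for t in joined])
```

The product has six tuples, but only the two that relate an employee to their own department survive the condition.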
Natural Join Example
Relations r, s:
Natural Join
r ⋈ s
Figure in-2.1
Conditional join
A conditional join works similarly to a natural join. In a natural join, the default condition
is equality between the common attributes, while in a conditional join we can specify
any condition, such as greater than, less than, or not equal.
Let us see the example below.
R.ID R.Sex R.Marks S.ID S.Sex S.Marks
-----------------------------------------------
1 F 45 10 M 20
1 F 45 11 M 22
2 F 55 10 M 20
2 F 55 11 M 22
3 F 60 10 M 20
3 F 60 11 M 22
3 F 60 12 M 59
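The input relations and the join condition for the table above are not shown in the slides. The sketch below assumes that R and S each hold three (ID, Sex, Marks) tuples and that the condition is R.Marks > S.Marks; under those assumptions a conditional join yields the same seven rows.

```python
# Assumed input relations: (ID, Sex, Marks). These are inferred, not given.
R = [(1, "F", 45), (2, "F", 55), (3, "F", 60)]
S = [(10, "M", 20), (11, "M", 22), (12, "M", 59)]

MARKS = 2  # position of the Marks attribute in each tuple


def conditional_join(r, s, condition):
    """Pair each tuple of r with each tuple of s that satisfies the condition."""
    return [a + b for a in r for b in s if condition(a, b)]


# Assumed condition: R.Marks > S.Marks
result = conditional_join(R, S, lambda a, b: a[MARKS] > b[MARKS])

for row in result:
    print(*row)
```

Changing the lambda to another comparison, such as less than or not equal, gives a different conditional join over the same inputs.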
Thank You