DBMS UNIT II Relational Algebra
RELATIONAL MODEL
Structure of Relational Database: A relational database consists of a collection of tables. Each table
is assigned a unique name. A row in a table represents a relationship among a set of values. For basic
structure, consider the deposit table of the following figure:
Domain: The set of permitted values of an attribute of a table is called its domain. The above table has
four attributes: Branch Name, Account Number, Customer Name, and Balance. For each attribute, there
is a set of permitted values, called the domain of that attribute.
For example, for the attribute Branch Name, the domain is the set of all branch names. Similarly, for
Balance, the domain is the set of all balance values.
Ex: The relation Deposit above has four attributes. D1 is the set of all branch names, D2 is the
set of all account numbers, D3 is the set of all customer names and D4 is the set of all balance values.
Every row of the deposit relation is a 4-tuple (v1, v2, v3, v4), where v1 is a branch name (i.e., v1
is in domain D1), v2 is an account number (i.e., v2 is in domain D2), v3 is a customer name (i.e., v3 is
in domain D3) and v4 is a balance (i.e., v4 is in domain D4). In general, deposit contains only a
subset of all possible rows,
i.e., Deposit is a subset of D1 X D2 X D3 X D4.
TUPLE: A row of a table (flat file) is called a tuple of the relation. The number of tuples in a relation is called its cardinality.
Ex: In the above relation, there are five tuples (rows), i.e., the cardinality of the given relation is five.
Degree of a Relation: The number of attributes in a given relation is called the degree of the relation.
Ex: The degree of the given relation is 4.
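As a rough illustration of these terms, a relation can be modelled in Python as a set of tuples. The domain values and rows below are hypothetical, chosen only for illustration; the one idea taken from the text is that Deposit is a subset of D1 X D2 X D3 X D4.

```python
from itertools import product

# Hypothetical domains, chosen only for illustration.
D1 = {"Downtown", "Uptown"}   # branch names
D2 = {101, 102}               # account numbers
D3 = {"Asha", "Ravi"}         # customer names
D4 = {500, 700}               # balances

# A relation is a set of tuples drawn from D1 x D2 x D3 x D4.
deposit = {
    ("Downtown", 101, "Asha", 500),
    ("Uptown", 102, "Ravi", 700),
}

all_possible_rows = set(product(D1, D2, D3, D4))

is_subset = deposit <= all_possible_rows   # Deposit is a subset of the product
cardinality = len(deposit)                 # number of tuples (rows)
degree = len(next(iter(deposit)))          # number of attributes (columns)

print(is_subset, cardinality, degree)
```

With these sample values the product contains 16 possible rows, of which the relation holds only 2.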
KEYS IN A RELATIONAL DATABASE
KEY: A data item (attribute) that is used to identify or locate a record is called a key (entity identifier).
VARIOUS KEYS IN A RELATIONAL DATABASE
1) Primary key 2) Candidate key 3) Alternate key 4) Secondary key 5) Super key 6) Foreign key
1) PRIMARY KEY: The primary key is defined as a data item (attribute) which uniquely identifies a
record (one row) in a relation.
Roll-No Name Date-of-Birth Second-Language Division
95 Swetha 31st Dec Hindi First
96 Dhatri 26th Jan Sanskrit Second
97 Shivani 15th Aug Telugu First
98 Kavya 01st Nov Hindi Second
99 Jagruti 29th Feb Sanskrit First
100 Shravani 26th Jan Telugu First
The student Relation
In the above student relation, we can choose the Roll-No attribute as the primary key because each row
has a unique Roll-No value, i.e., no roll number is repeated in any other row.
2) CANDIDATE KEY: When a relation has more than one attribute combination possessing
the unique identification property, all the combinations of attributes that could serve as a
primary key are called the candidate keys of the given relation.
Note: A candidate key must be minimal, i.e., no proper subset of a candidate key may itself identify each row uniquely.
{Roll Number} identifies each row uniquely, so it is a candidate key.
{Second Language, Date of Birth} identifies each row uniquely, so it is another candidate key.
{Date of Birth, Division} identifies each row uniquely, so it is another candidate key.
{Roll Number, Date of Birth} also identifies each row uniquely, but its proper subsets are {Roll Number} and
{Date of Birth}.
Here {Roll Number} alone already identifies each row uniquely, so {Roll Number, Date of Birth} is not minimal.
So {Roll Number, Date of Birth} is not a candidate key.
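The uniqueness and minimality checks described above can be sketched in Python against the student relation. The helper names `is_unique` and `is_candidate_key` are hypothetical. The check looks only at the rows currently stored, whereas a real key is a constraint that must hold for every legal state of the relation.

```python
from itertools import combinations

ATTRS = ("Roll-No", "Name", "Date-of-Birth", "Second-Language", "Division")
STUDENT = [
    (95, "Swetha", "31st Dec", "Hindi", "First"),
    (96, "Dhatri", "26th Jan", "Sanskrit", "Second"),
    (97, "Shivani", "15th Aug", "Telugu", "First"),
    (98, "Kavya", "01st Nov", "Hindi", "Second"),
    (99, "Jagruti", "29th Feb", "Sanskrit", "First"),
    (100, "Shravani", "26th Jan", "Telugu", "First"),
]

def is_unique(attrs):
    """True if the given attributes identify every row of STUDENT uniquely."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in STUDENT}
    return len(projected) == len(STUDENT)

def is_candidate_key(attrs):
    """Unique, and no proper subset is unique (minimality)."""
    if not is_unique(attrs):
        return False
    return not any(
        is_unique(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_candidate_key(("Roll-No",)))                           # True
print(is_candidate_key(("Second-Language", "Date-of-Birth")))   # True
print(is_candidate_key(("Date-of-Birth", "Division")))          # True
print(is_candidate_key(("Roll-No", "Date-of-Birth")))           # False
```

The last call returns False because {Roll-No, Date-of-Birth} is unique but not minimal: its subset {Roll-No} is already unique.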
3) ALTERNATE KEY: A candidate key that is not the primary key is called an alternate key.
{Roll number} is unique for every student. Suppose all student names are also unique (no two students have
the same name). Then we can choose either Roll number or Name as the primary key.
In such a case we may arbitrarily choose one of the candidates, say Roll Number, as the primary key for
the relation. A candidate key that is not the primary key, such as Name in the student relation, is called
an alternate key.
i.e., if Roll number is the primary key, then Name is an alternate key. If Name is the primary key, then Roll
Number is an alternate key.
4) SECONDARY KEY: A system may also use a key that does not identify a unique record or tuple
but identifies all those that have a certain property. This is referred to as a secondary key.
Ex: In the relation STUDENT, the attribute “Second Language” may be used as a
secondary key. This key could be used to identify those entities (students) whose second
language is Hindi, Telugu or Sanskrit.
Second-Language   Roll-No
Hindi             95
                  98
Sanskrit          96
                  99
Telugu            97
                  100
Secondary Keys
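A secondary key behaves like an index that maps one attribute value to all the rows sharing it. A minimal Python sketch, using only the Roll-No and Second-Language columns of the student relation (the variable name `index` is an arbitrary choice):

```python
from collections import defaultdict

# (Roll-No, Second-Language) pairs taken from the student relation.
STUDENT = [
    (95, "Hindi"), (96, "Sanskrit"), (97, "Telugu"),
    (98, "Hindi"), (99, "Sanskrit"), (100, "Telugu"),
]

# Group roll numbers by the secondary-key value.
index = defaultdict(list)
for roll_no, language in STUDENT:
    index[language].append(roll_no)

print(index["Hindi"])     # [95, 98]
print(index["Sanskrit"])  # [96, 99]
print(index["Telugu"])    # [97, 100]
```

Unlike a primary key lookup, each lookup here returns a list of rows rather than a single row.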
5) SUPER KEY: A set of one or more attributes that, taken together, uniquely identifies a record is
defined as a super key. As shown in the figure below, neither supplier no. (S#) nor product no. (P#) alone is
enough to identify each row. To identify each row uniquely, we need the combined
attributes S# and P#, i.e., {S#, P#} is a super key (also called a concatenated key, because it combines more than one attribute).
S# P# Qty
S1 P1 500
S1 P2 700
S1 P3 450
S2 P4 920
S3 P1 650
S3 P5 400
Super Keys
6) FOREIGN KEY: A foreign key is an attribute or group of attributes in one table that refers to
a key (an attribute or group of attributes forming a key) of a record in a different table. Usually a
foreign key in one table refers to the primary key of another table. In this way, references link the
two tables. Ex: In an Account relation, Customer_ID is considered the foreign key.
QUERY LANGUAGES
A query language is a language in which a user requests information from the database. These languages
are typically of a higher level than standard programming languages. Query languages can be
categorized as being either
a) Procedural
b) Non-procedural.
In a procedural language, the user instructs the system to perform a sequence of operations on the
database to compute the desired result. Relational algebra is a procedural query language. In a
nonprocedural language, the user describes the information desired without giving a specific procedure
for obtaining the information. The tuple relational calculus and the domain relational calculus are
nonprocedural query languages.
RELATIONAL ALGEBRA
The relational algebra is a procedural query language. It consists of a set of operations that take one or
two relations as input and produce a new relation as their result. The fundamental operations in the
relational algebra are select, project, union, set difference, Cartesian product, and rename. In addition to
the fundamental operations, there are several other operations—namely, set intersection, natural join,
and assignment. We shall define these operations in terms of the fundamental operations.
Fundamental Operations
The select, project, and rename operations are called unary operations, because they operate on one
relation. The other three operations operate on pairs of relations and are, therefore, called binary
operations.
Consider the two relations STUDENT and QUARTERLY. The STUDENT relation describes the
personal information of each student: roll no, name, date of birth and 2nd language. The
QUARTERLY relation describes each student's marks in 3 subjects, together with the roll no.
STUDENT
Roll-No Name Date-of-Birth Second-Language
1 Sunny 01-07-70 Hindi
2 Rashni 15-08-72 Sanskrit
3 Anthra 29-01-71 Hindi
4 Nasreen 31-12-70 Telugu
QUARTERLY
Roll-No Maths Physics Computers
1 72 85 90
2 65 74 68
3 97 94 96
4 87 93 72
The following are queries based on relational algebra to obtain required information from stored
relational database.
1) The SELECT Operation: The select operation selects tuples that satisfy a given predicate. The lower
case Greek letter sigma (σ) denotes the select operation. The predicate appears as a subscript to σ.
The argument relation is given in parentheses. The general form of the selection operation is:
σ predicate (relation)
The comparisons =, ≠, <, >, ≤, ≥ are allowed in the selection predicate. Furthermore,
several predicates may be combined into a larger predicate using the connectives and (∧) and or (∨).
Ex: 1) List the complete information about all students whose 2nd language is Hindi.
σ Second-Language = 'Hindi' (STUDENT)
Result of the above Query is:
Roll-No Name Date-of-Birth Second-Language
1 Sunny 01-07-70 Hindi
3 Anthra 29-01-71 Hindi
Ex: 2) Display the Roll no and marks of all students who secured more than 90 in all three
subjects.
σ (Maths > 90) ∧ (Physics > 90) ∧ (Computers > 90) (QUARTERLY)
Result of the above Query is:
Roll-No Maths Physics Computers
3 97 94 96
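The two selections above can be sketched in Python by treating each relation as a list of dictionaries and the predicate as a function. The helper name `select` is an assumption made for this sketch.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny", "Date-of-Birth": "01-07-70", "Second-Language": "Hindi"},
    {"Roll-No": 2, "Name": "Rashni", "Date-of-Birth": "15-08-72", "Second-Language": "Sanskrit"},
    {"Roll-No": 3, "Name": "Anthra", "Date-of-Birth": "29-01-71", "Second-Language": "Hindi"},
    {"Roll-No": 4, "Name": "Nasreen", "Date-of-Birth": "31-12-70", "Second-Language": "Telugu"},
]
QUARTERLY = [
    {"Roll-No": 1, "Maths": 72, "Physics": 85, "Computers": 90},
    {"Roll-No": 2, "Maths": 65, "Physics": 74, "Computers": 68},
    {"Roll-No": 3, "Maths": 97, "Physics": 94, "Computers": 96},
    {"Roll-No": 4, "Maths": 87, "Physics": 93, "Computers": 72},
]

def select(predicate, relation):
    """sigma: keep the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]

# sigma Second-Language = 'Hindi' (STUDENT)
hindi = select(lambda t: t["Second-Language"] == "Hindi", STUDENT)

# sigma (Maths > 90) and (Physics > 90) and (Computers > 90) (QUARTERLY)
toppers = select(
    lambda t: t["Maths"] > 90 and t["Physics"] > 90 and t["Computers"] > 90,
    QUARTERLY,
)

print([t["Name"] for t in hindi])        # ['Sunny', 'Anthra']
print([t["Roll-No"] for t in toppers])   # [3]
```

Selection never changes the columns of a relation; it only filters rows.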
2) The PROJECT Operation: The projection of a relation is defined as a projection of all its tuples
over some set of attributes, i.e., it yields a "vertical subset" of the relation. The projection operation
is used either to reduce the number of attributes in the resultant relation or to reorder attributes.
Since a relation is a set, duplicate rows are eliminated from the result.
Projection is denoted by the Greek letter pi (Π). We list the attributes that we wish to appear in the
result as a subscript to Π. The argument relation follows in parentheses.
The general form of the projection operation is:
Π list-of-attributes (relation)
Ex: 1) List all Roll Nos. and their Computers marks.
Π Roll-No, Computers (QUARTERLY)
Result of the above Query is:
Roll-No Computers
1 90
2 68
3 96
4 72
2) Display the names and dates of birth of all students whose 2nd language is Hindi.
Π Name, Date-of-Birth (σ Second-Language = 'Hindi' (STUDENT))
Result of the above query is:
Name Date-of-Birth
Sunny 01-07-70
Anthra 29-01-71
3) What is the Date of Birth of Rashni?
Π Date-of-Birth (σ Name = 'Rashni' (STUDENT))
Result of the above query is:
Date-of-Birth
15-08-72
4) Find all Roll Nos who secured more than 90 marks in Computers.
Π Roll-No (σ Computers > 90 (QUARTERLY))
Result of the above query is:
Roll-No
3
(Roll No 1 scored exactly 90 in Computers, so it does not satisfy Computers > 90.)
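Projection, and its composition with selection, can be sketched in Python as follows. The helper names `select` and `project` are assumptions made for this sketch, and only the attributes needed by the examples are carried in the sample data. Because the comparison is strictly greater than 90, only roll number 3 qualifies in the last query; roll number 1 scored exactly 90.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny", "Date-of-Birth": "01-07-70", "Second-Language": "Hindi"},
    {"Roll-No": 2, "Name": "Rashni", "Date-of-Birth": "15-08-72", "Second-Language": "Sanskrit"},
    {"Roll-No": 3, "Name": "Anthra", "Date-of-Birth": "29-01-71", "Second-Language": "Hindi"},
    {"Roll-No": 4, "Name": "Nasreen", "Date-of-Birth": "31-12-70", "Second-Language": "Telugu"},
]
QUARTERLY = [
    {"Roll-No": 1, "Computers": 90},
    {"Roll-No": 2, "Computers": 68},
    {"Roll-No": 3, "Computers": 96},
    {"Roll-No": 4, "Computers": 72},
]

def select(predicate, relation):
    """sigma: keep the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]

def project(attrs, relation):
    """Pi: keep only the listed attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# Pi Date-of-Birth (sigma Name = 'Rashni' (STUDENT))
dob = project(["Date-of-Birth"], select(lambda t: t["Name"] == "Rashni", STUDENT))

# Pi Roll-No (sigma Computers > 90 (QUARTERLY))
high = project(["Roll-No"], select(lambda t: t["Computers"] > 90, QUARTERLY))

# Duplicate elimination: four students, but only three distinct languages.
languages = project(["Second-Language"], STUDENT)

print(dob)              # [{'Date-of-Birth': '15-08-72'}]
print(high)             # [{'Roll-No': 3}]
print(len(languages))   # 3
```

The selection is applied first and the projection second, which mirrors reading the algebra expression from the innermost parentheses outward.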
3) The RENAME Operation: Unlike relations in the DB, the results of relational-algebra expressions
do not have a name that we can use to refer to them. It is useful to be able to give them names; the
rename operator, denoted by the lowercase Greek letter rho (ρ), lets us do this. Given a relational-
algebra expression E, the following expression returns the result of expression E under the name x.
ρ x (E)
Ex: ρ teacher (instructor)
A relation r by itself is considered a (trivial) relational-algebra expression. Thus, we can also apply
the rename operation to a relation r to get the same relation under a new name.
A Second form of the rename operation is as follows: Assume that a relational algebra expression E
has arity n. Then, the following expression returns the result of expression E under the name x, and
with the attributes renamed to A1, A2, . . . , An.
ρ x(A1,A2,...,An) (E)
Ex: ρ teacher (id, name, sal) (instructor)
4) The CARTESIAN-PRODUCT Operation: The Cartesian product, denoted by a cross (X), combines information from any two relations. The product of relations r1 and r2 is written r = r1 X r2.
If there are n1 tuples in relation r1 and n2 tuples in relation r2, then there are n1 x n2 ways of choosing a pair
of tuples, one tuple from each relation, so there are n1 x n2 tuples in r.
To list the Name and Computers marks, we have to refer to both relations, STUDENT and
QUARTERLY: Name is an attribute of the STUDENT relation and Computers is an attribute
of the QUARTERLY relation. Combining the 2 relations is denoted by "X":
STUDENT X QUARTERLY
Roll-No Name Date-of-Birth Second-Language Roll-No Maths Physics Computers
1 Sunny 01-07-70 Hindi 1 72 85 90
1 Sunny 01-07-70 Hindi 2 65 74 68
1 Sunny 01-07-70 Hindi 3 97 94 96
1 Sunny 01-07-70 Hindi 4 87 93 72
2 Rashni 15-08-72 Sanskrit 1 72 85 90
2 Rashni 15-08-72 Sanskrit 2 65 74 68
2 Rashni 15-08-72 Sanskrit 3 97 94 96
2 Rashni 15-08-72 Sanskrit 4 87 93 72
3 Anthra 29-01-71 Hindi 1 72 85 90
3 Anthra 29-01-71 Hindi 2 65 74 68
3 Anthra 29-01-71 Hindi 3 97 94 96
3 Anthra 29-01-71 Hindi 4 87 93 72
4 Nasreen 31-12-70 Telugu 1 72 85 90
4 Nasreen 31-12-70 Telugu 2 65 74 68
4 Nasreen 31-12-70 Telugu 3 97 94 96
4 Nasreen 31-12-70 Telugu 4 87 93 72
The required information is retrieved from the above relation STUDENT X QUARTERLY by selecting the tuples in which the common
attribute has the same value in both relations. In the given two relations, Roll-No is the common attribute, so the selection predicate is
STUDENT.Roll-No = QUARTERLY.Roll-No
Π Name, Computers (σ STUDENT.Roll-No = QUARTERLY.Roll-No (STUDENT X QUARTERLY))
Name Computers
Sunny 90
Rashni 68
Anthra 96
Nasreen 72
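The product-then-select-then-project pipeline above can be sketched in Python. The helper name `cartesian_product` is hypothetical. Attribute names are prefixed with the relation name so that the two Roll-No columns can be told apart, and only the attributes needed for this query are carried.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny"},
    {"Roll-No": 2, "Name": "Rashni"},
    {"Roll-No": 3, "Name": "Anthra"},
    {"Roll-No": 4, "Name": "Nasreen"},
]
QUARTERLY = [
    {"Roll-No": 1, "Computers": 90},
    {"Roll-No": 2, "Computers": 68},
    {"Roll-No": 3, "Computers": 96},
    {"Roll-No": 4, "Computers": 72},
]

def cartesian_product(r, r_name, s, s_name):
    """r X s: pair every tuple of r with every tuple of s."""
    return [
        {**{f"{r_name}.{k}": v for k, v in t1.items()},
         **{f"{s_name}.{k}": v for k, v in t2.items()}}
        for t1 in r
        for t2 in s
    ]

pairs = cartesian_product(STUDENT, "STUDENT", QUARTERLY, "QUARTERLY")

# sigma STUDENT.Roll-No = QUARTERLY.Roll-No
matched = [t for t in pairs if t["STUDENT.Roll-No"] == t["QUARTERLY.Roll-No"]]

# Pi Name, Computers
result = [(t["STUDENT.Name"], t["QUARTERLY.Computers"]) for t in matched]

print(len(pairs))   # 16
print(result)       # [('Sunny', 90), ('Rashni', 68), ('Anthra', 96), ('Nasreen', 72)]
```

The product has 4 x 4 = 16 tuples, of which only the 4 with matching roll numbers are meaningful.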
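(ii) Find the Student Roll No, Date of Birth, 2nd Language, Maths, Physics and Computer Marks.
Π STUDENT.Roll-No, Date-of-Birth, Second-Language, Maths, Physics, Computers (σ STUDENT.Roll-No = QUARTERLY.Roll-No (STUDENT X QUARTERLY))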
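5) The UNION Operation: The union of two relations r and s, written r ∪ s, contains the tuples that appear in r, in s, or in both. For r ∪ s to be valid, two conditions must hold:
The relations r and s must be of the same arity, i.e., they must have the same number of attributes.
The domains of the ith attribute of r and the ith attribute of s must be the same, for all i.
Note that r and s can be either database relations or temporary relations that are the result of relational
algebra expressions.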
As an example, consider the relations CULTURAL (name, class) and SPORTS (name, class). These
two relations represent information about all cultural competition winners & sports winners separately.
CULTURAL SPORTS
Ex: Find the names of all MPC III students who won a cultural competition or a sports competition or both.
Π Name (σ Class = ‘MPC III’ (CULTURAL)) ∪ Π Name (σ Class = ‘MPC III’ (SPORTS))
6) SET-DIFFERENCE Operation: The difference between two relations r and s is r - s. The result
relation contains the set of tuples belonging to r and not in s.
Ex: Find Student Names of MPC III who won cultural prizes but not sports.
Π Name (σ Class = ‘MPC III’ (CULTURAL)) - Π Name (σ Class = ‘MPC III’ (SPORTS))
Cultural winners (MPC III)   Sports winners (MPC III)   Cultural but not sports
NAME                         NAME                       NAME
Kavya                        Hima                       Kavya
Zeba                         Sheela                     Zeba
Hima
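Union and set difference map directly onto Python's set operators. The sketch below assumes that the three NAME columns above list, in order, the MPC III cultural winners, the MPC III sports winners, and the result of the difference. That reading is an inference from the flattened table, not something the text states.

```python
# Assumed inputs: names of MPC III winners after selection and projection.
cultural = {"Kavya", "Zeba", "Hima"}
sports = {"Hima", "Sheela"}

# Both operands have the same arity (one attribute, Name) over the same
# domain, so union and difference are well defined.
either = cultural | sports          # won cultural or sports or both
cultural_only = cultural - sports   # won cultural but not sports

print(sorted(either))          # ['Hima', 'Kavya', 'Sheela', 'Zeba']
print(sorted(cultural_only))   # ['Kavya', 'Zeba']
```

Hima appears only once in the union because a relation, like a Python set, holds no duplicates.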
Formal Definition of the Relational Algebra: The fundamental operations of relational algebra allow
us to give a complete definition of an expression in the relational algebra. A basic expression in the
relational algebra consists of either one of the following:
A relation in the database
A constant relation
A constant relation is written by listing its tuples within { }, for example
{(22222, Einstein, Physics, 95000), (76543, Singh, Finance, 80000)}.
A general expression in the relational algebra is constructed out of smaller sub expressions. Let E1 and
E2 be relational-algebra expressions. Then, the following are all relational-algebra expressions:
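E1 ∪ E2
E1 − E2
E1 × E2
σ P (E1), where P is a predicate on attributes in E1
Π S (E1), where S is a list consisting of some of the attributes in E1
ρ x (E1), where x is the new name for the result of E1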
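Additional Operations
1) SET-INTERSECTION Operation: The intersection of two relations r and s, written r ∩ s, contains the tuples that are in both r and s. It can be expressed with set difference as r ∩ s = r - (r - s).
Ex: List the names of all MPC III students who won both the cultural and the sports competition.
Π Name (σ Class = ‘MPC III’ (CULTURAL)) ∩ Π Name (σ Class = ‘MPC III’ (SPORTS))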
2) NATURAL JOIN Operation: Natural join is a binary operation that allows us to combine certain
selections and a Cartesian product into one operation. It is denoted by the join symbol (⋈). The natural
join operation forms a Cartesian product of its two arguments, performs a selection forcing equality
on those attributes that appear in both relation schemas, and finally removes duplicate columns.
Ex: STUDENT ⋈ QUARTERLY
In the STUDENT and QUARTERLY relations, Roll-No is the attribute common to both schemas,
so STUDENT ⋈ QUARTERLY becomes:
Roll-No Name Date-of-Birth Second-Language Maths Physics Computers
1 Sunny 01-07-70 Hindi 72 85 90
2 Rashni 15-08-72 Sanskrit 65 74 68
3 Anthra 29-01-71 Hindi 97 94 96
4 Nasreen 31-12-70 Telugu 87 93 72
The query can be written as
Π Roll-No, Name, Date-of-Birth, Second-Language, Maths, Physics, Computers (STUDENT ⋈ QUARTERLY)
Now, to find student names and Computers marks, the query will be:
Π Name, Computers (STUDENT ⋈ QUARTERLY)
Name Computers
Sunny 90
Rashni 68
Anthra 96
Nasreen 72
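A natural join can be sketched in Python as a nested loop that keeps pairs of tuples agreeing on every common attribute. The helper name `natural_join` is an assumption, and only some attributes of each relation are carried to keep the example short.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny", "Second-Language": "Hindi"},
    {"Roll-No": 2, "Name": "Rashni", "Second-Language": "Sanskrit"},
    {"Roll-No": 3, "Name": "Anthra", "Second-Language": "Hindi"},
    {"Roll-No": 4, "Name": "Nasreen", "Second-Language": "Telugu"},
]
QUARTERLY = [
    {"Roll-No": 1, "Maths": 72, "Computers": 90},
    {"Roll-No": 2, "Maths": 65, "Computers": 68},
    {"Roll-No": 3, "Maths": 97, "Computers": 96},
    {"Roll-No": 4, "Maths": 87, "Computers": 72},
]

def natural_join(r, s):
    """Combine tuples of r and s that agree on all common attributes.

    Merging the two dictionaries keeps each common attribute only once,
    which is the 'remove duplicate columns' step.
    """
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**t1, **t2}
        for t1 in r
        for t2 in s
        if all(t1[a] == t2[a] for a in common)
    ]

joined = natural_join(STUDENT, QUARTERLY)

# Pi Name, Computers (STUDENT join QUARTERLY)
names_and_marks = [(t["Name"], t["Computers"]) for t in joined]

print(list(joined[0].keys()))
# ['Roll-No', 'Name', 'Second-Language', 'Maths', 'Computers']
print(names_and_marks)
# [('Sunny', 90), ('Rashni', 68), ('Anthra', 96), ('Nasreen', 72)]
```

Compared with the explicit Cartesian product, the equality condition and the column clean-up are built into the operator, so the query needs no selection predicate.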
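3) ASSIGNMENT Operation: The assignment operation, denoted by ←, works like assignment in a programming language: the result of the expression on the right of ← is assigned to the relation variable on its left.
For relational algebra queries, assignment must always be made to a temporary relation variable. Note
that the assignment operation does not provide any additional power to the algebra. It is, however, a
convenient way to express complex queries.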
Aggregate Functions: Aggregate Functions take a collection of values and return a single value as a
result. Aggregate operation in relational algebra is expressed as:
G1, G2, …, Gn 𝒢 F1(A1), F2(A2), …, Fn(An) (E)
Where, E is any relational algebra expression
Each Fi is an aggregate function
Each Ai is an attribute name
G1, G2 …, Gn is a list of attributes on which to group (can be empty)
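Grouping and aggregation can be sketched in Python as below. The helper `aggregate` is hypothetical and handles a single aggregate function per call, whereas the general form allows several. An empty grouping list treats the whole relation as one group.

```python
from collections import defaultdict

STUDENT = [
    {"Roll-No": 1, "Second-Language": "Hindi"},
    {"Roll-No": 2, "Second-Language": "Sanskrit"},
    {"Roll-No": 3, "Second-Language": "Hindi"},
    {"Roll-No": 4, "Second-Language": "Telugu"},
]
QUARTERLY = [
    {"Roll-No": 1, "Maths": 72, "Computers": 90},
    {"Roll-No": 2, "Maths": 65, "Computers": 68},
    {"Roll-No": 3, "Maths": 97, "Computers": 96},
    {"Roll-No": 4, "Maths": 87, "Computers": 72},
]

def aggregate(group_by, func, attr, relation):
    """Group the relation on group_by, then apply func to attr in each group."""
    groups = defaultdict(list)
    for row in relation:
        key = tuple(row[g] for g in group_by)
        groups[key].append(row[attr])
    return {key: func(values) for key, values in groups.items()}

# No grouping attributes: one value for the whole relation.
max_computers = aggregate([], max, "Computers", QUARTERLY)
avg_maths = aggregate([], lambda v: sum(v) / len(v), "Maths", QUARTERLY)

# Group by Second-Language and count the students in each group.
per_language = aggregate(["Second-Language"], len, "Roll-No", STUDENT)

print(max_computers)   # {(): 96}
print(avg_maths)       # {(): 80.25}
print(per_language)    # {('Hindi',): 2, ('Sanskrit',): 1, ('Telugu',): 1}
```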
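Arithmetic operations involving null return null, and comparisons involving null return the special truth value unknown.
For logical operators with an input that is unknown:
and: (true and unknown) = unknown; (false and unknown) = false; (unknown and unknown) = unknown
or: (true or unknown) = true; (false or unknown) = unknown; (unknown or unknown) = unknown
not: (not unknown) = unknown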
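The usual three-valued logic for null can be sketched in Python by letting None stand for unknown. The function names `and3`, `or3`, `not3` and `greater_than` are hypothetical helpers for this sketch.

```python
def and3(a, b):
    """Three-valued AND: false dominates, otherwise unknown dominates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    """Three-valued OR: true dominates, otherwise unknown dominates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    """Three-valued NOT: unknown stays unknown."""
    return None if a is None else not a

def greater_than(x, y):
    """A comparison involving null yields unknown (None)."""
    if x is None or y is None:
        return None
    return x > y

print(and3(True, None))        # None  (unknown)
print(and3(False, None))       # False
print(or3(True, None))         # True
print(or3(False, None))        # None  (unknown)
print(not3(None))              # None  (unknown)
print(greater_than(None, 90))  # None  (unknown)
```

A selection keeps a tuple only when its predicate evaluates to true, so tuples whose predicate is unknown are left out of the result.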
MODIFYING THE DATABASE: The content of the database may be modified using the following
operations:
Deletion
Insertion
Updating
All these operations are expressed using the assignment operator.
Deletion: A delete request is expressed in much the same way as a query. However, instead of
displaying tuples to the user, we remove the selected tuples from the database. We may delete only
whole tuples; we cannot delete values on only particular attributes. In relational algebra, a deletion is
expressed by:
r ← r - E
where r is a relation and E is a relational algebra query.
Ex: (i) Delete the information of the students whose second language is Hindi.
STUDENT ← STUDENT - σ Second-Language = 'Hindi' (STUDENT)
(ii) Delete the marks of the students who got less than 75 marks in Computers.
QUARTERLY ← QUARTERLY - σ Computers < 75 (QUARTERLY)
Insertion: To insert data into a relation, we either specify a tuple to be inserted or write a query
whose result is a set of tuples to be inserted. Obviously, the attribute values for inserted tuples must
be members of the attribute's domain. In relational algebra, an insertion is expressed by :
r ← r ∪ E
where r is a relation and E is a relational algebra expression.
Ex: To insert the information of Mathew with Roll No. 5, Date of Birth 15-09-71 and Second
Language Telugu, we write:
STUDENT ← STUDENT ∪ {(5, Mathew, 15-09-71, Telugu)}
Updating: In certain situations we may wish to change a value in a tuple without changing all values in
the tuple. This can be done by using the generalized projection operator:
r ← Π F1, F2, …, Fn (r)
where each Fi is either the ith attribute of r, if the ith attribute is not updated, or, if the attribute is to be
updated, an expression involving only constants and the attributes of r that gives the new value for
the attribute.
Ex: To increase the salary of each instructor by 10%, we write (assuming the instructor schema has the attributes id, name and sal):
instructor ← Π id, name, sal * 1.10 (instructor)
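The three modification operations can be sketched in Python with sets of tuples, where plain assignment plays the role of the assignment operator. The instructor rows and the schema (id, name, sal) are hypothetical values used only to illustrate the update.

```python
student = {
    (1, "Sunny", "01-07-70", "Hindi"),
    (2, "Rashni", "15-08-72", "Sanskrit"),
    (3, "Anthra", "29-01-71", "Hindi"),
    (4, "Nasreen", "31-12-70", "Telugu"),
}

# Insertion: r <- r UNION E, where E is a constant relation with one tuple.
student = student | {(5, "Mathew", "15-09-71", "Telugu")}

# Deletion: r <- r - E, where E selects the tuples to remove.
hindi = {t for t in student if t[3] == "Hindi"}
student = student - hindi

# Update: r <- generalized projection over every attribute of r.
# Hypothetical instructor relation with schema (id, name, sal).
instructor = {(1, "Asha", 1000), (2, "Ravi", 2000)}
instructor = {(i, name, round(sal * 1.10, 2)) for (i, name, sal) in instructor}

print(sorted(t[0] for t in student))   # [2, 4, 5]
print(sorted(instructor))              # [(1, 'Asha', 1100.0), (2, 'Ravi', 2200.0)]
```

The update rebuilds every tuple and lists the unchanged attributes explicitly, which is why the generalized projection must name all attributes of the relation and not just the one being changed.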
The term select in relational algebra has a different meaning than the one used in SQL. In relational
algebra, the term select corresponds to what we refer to in SQL as where. We emphasize the different
interpretations here to minimize potential confusion.