Module-2 Notes
In the relational model, a database is a collection of one or more relations, where each relation is a table
with rows and columns. The major advantages of the relational model over the older data models are its
simple data representation and the ease with which even complex queries can be expressed.
Terminology
Relation: A relation is a table with columns and rows.
Domain: A domain is the set of allowable values for one or more attributes.
A relation consists of a relation schema and a relation instance. The schema specifies the relation’s
name, the name of each field and the domain of each field.
Ex: schema for a Students relation:
Students(sid: integer, sname: string, login: string, age: integer, gpa: real)
An instance is a snapshot of the database: the collection of information stored in the
database at a particular instant of time.
[Figure: an instance of the Students relation; its fields are also called attributes or columns]
DATABASE MANAGEMENT SYSTEMS Module-II
1. Domain Constraints
A relation schema specifies the domain of each field in the relation instance. These domain
constraints in the schema specify the condition that each instance of the relation has to satisfy:
The values that appear in a column must be drawn from the domain associated with that column.
Thus, the domain of a field is essentially the type of that field.
2. Key Constraints
A Key Constraint is a statement that a certain minimal subset of the fields of a relation is a
unique identifier for a tuple.
Super Key:
An attribute, or set of attributes, that uniquely identifies a tuple within a relation.
However, a super key may contain additional attributes that are not necessary for unique
identification.
Ex: The customer_id of the relation customer is sufficient to distinguish one tuple from another.
Thus, customer_id is a super key. Similarly, the combination of customer_id and customer_name
is a super key for the relation customer. Here customer_name alone is not a super key, because
several people may have the same name.
We are often interested in super keys for which no proper subset is a super key. Such minimal super keys are
called candidate keys.
Candidate Key:
A super key such that no proper subset is a super key within the relation. A candidate key satisfies two conditions:
i. Two distinct tuples in a legal instance cannot have identical values in all the fields of the
key.
ii. No proper subset of the set of fields in a candidate key is a unique identifier for a tuple.
Ex: Although customer_id and customer_name together can distinguish customer tuples, their
combination does not form a candidate key, since customer_id alone is a candidate key.
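The difference between a super key and a candidate key can be checked mechanically on a given instance. The sketch below is only an illustration: the helper names is_superkey and is_candidate_key and the sample customer rows are invented here. A key constraint is a statement about every legal instance, so a check on one instance can refute a proposed key but cannot prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """A super key is a candidate key if no proper subset is also a super key."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False
    return True

# Hypothetical customer instance (values invented for illustration).
customers = [
    {"customer_id": 1, "customer_name": "Smith"},
    {"customer_id": 2, "customer_name": "Jones"},
    {"customer_id": 3, "customer_name": "Smith"},
]

print(is_superkey(customers, ["customer_id", "customer_name"]))       # True
print(is_superkey(customers, ["customer_name"]))                      # False
print(is_candidate_key(customers, ["customer_id"]))                   # True
print(is_candidate_key(customers, ["customer_id", "customer_name"]))  # False
```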
Primary Key:
The candidate key that is selected to identify tuples uniquely within the relation. Out of all the
available candidate keys, the database designer chooses one as the primary key. The candidate keys that
are not selected as the primary key are called alternate keys.
Ex: For the student relation, we can choose student_id as the primary key.
Foreign Key:
Foreign keys represent the relationships between tables. A foreign key is a column (or a group of
columns) whose values are drawn from the primary key of some other table.
The table in which the foreign key is defined is called the foreign table or detail table. The table that
defines the primary key and is referenced by the foreign key is called the primary table or
master table. A foreign key constraint imposes two rules:
i. Records cannot be inserted into a detail table if the corresponding records in the master
table do not exist.
ii. Records of the master table cannot be deleted, and their key values cannot be updated, while
corresponding records exist in the detail table.
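The two rules can be observed with a small experiment. The following sketch uses Python's standard sqlite3 module; the Customer (master) and Orders (detail) tables and the helper try_sql are hypothetical names chosen for this illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical master and detail tables.
conn.execute("CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT)")
conn.execute("""
    CREATE TABLE Orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        FOREIGN KEY (customer_id) REFERENCES Customer (customer_id)
    )
""")

def try_sql(statement):
    """Run one statement; report whether an integrity constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(statement)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_sql("INSERT INTO Customer VALUES (1, 'Smith')"),
    try_sql("INSERT INTO Orders VALUES (100, 1)"),          # master row exists
    try_sql("INSERT INTO Orders VALUES (101, 2)"),          # rule i: no customer 2
    try_sql("DELETE FROM Customer WHERE customer_id = 1"),  # rule ii: still referenced
]
print(results)  # ['ok', 'ok', 'rejected', 'rejected']
```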
General Constraints:
Domain, primary key, and foreign key constraints are considered to be a fundamental part of the
relational data model. Sometimes, however, it is necessary to specify more general constraints.
For example, we may require that student ages be within a certain range of values. Given such an
integrity constraint (IC), the DBMS rejects inserts and updates that violate it.
Current database systems support such general constraints in the form of table constraints and
assertions. Table constraints are associated with a single table and are checked whenever that table is
modified. In contrast, assertions involve several tables and are checked whenever any of these
tables is modified.
Ex. of a table constraint: the salary of an employee must always be above 1000.
Ex. of an assertion: the number of boats plus the number of sailors must be less than 100.
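A table constraint of the kind described above can be sketched with a CHECK clause. The example below runs against SQLite through Python's sqlite3 module; the salary column is a hypothetical addition to the Employees schema, made only for this illustration. SQLite does not support CREATE ASSERTION, so only the table constraint is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employees (
        ssn INTEGER PRIMARY KEY,
        name CHAR(20),
        salary REAL,              -- hypothetical column for this example
        CHECK (salary > 1000)     -- table constraint, checked on insert and update
    )
""")

conn.execute("INSERT INTO Employees VALUES (1, 'Ann', 2500.0)")  # satisfies the constraint

try:
    conn.execute("INSERT INTO Employees VALUES (2, 'Bob', 900.0)")  # violates it
    outcome = "accepted"
except sqlite3.IntegrityError:
    outcome = "rejected"

count = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
print(outcome, count)  # rejected 1
```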
A relational database query (query, for short) is a question about the data, and the answer consists of a
new relation containing the result.
For example, we might want to find all students younger than 18 or all students enrolled in DBMS.
Query Languages
A query language is a specialized language for writing queries. This is the language in which a user
requests information from the database.
SQL is the most popular language for a relational DBMS. Consider the instance of the Students relation
shown in the figure at the start of this unit. We can retrieve the rows corresponding to students who
are younger than 20 with the following query:
SELECT * FROM Students S WHERE S.age < 20;
In addition to selecting a subset of rows, a query can extract a subset of the columns of each selected row.
We can compute the names and logins of students who are younger than 20 with the following query:
SELECT S.sname, S.login FROM Students S WHERE S.age < 20;
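As a rough, runnable sketch of these two kinds of query (all rows of young students, then only their names and logins), the snippet below uses Python's standard sqlite3 module. The three sample rows are invented for illustration and are not the instance from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid INTEGER PRIMARY KEY, sname TEXT, login TEXT, age INTEGER, gpa REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ann", "ann@cs", 18, 3.4),
        (2, "Bob", "bob@ee", 19, 3.2),
        (3, "Cy", "cy@cs", 21, 3.8),
    ],
)

# Select a subset of the rows.
young_rows = conn.execute(
    "SELECT * FROM Students S WHERE S.age < 20 ORDER BY S.sid"
).fetchall()

# Select a subset of the rows and a subset of the columns.
young_names = conn.execute(
    "SELECT S.sname, S.login FROM Students S WHERE S.age < 20 ORDER BY S.sid"
).fetchall()

print(young_rows)   # [(1, 'Ann', 'ann@cs', 18, 3.4), (2, 'Bob', 'bob@ee', 19, 3.2)]
print(young_names)  # [('Ann', 'ann@cs'), ('Bob', 'bob@ee')]
```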
There are a number of "pure" languages. The relational algebra is procedural, whereas the tuple
relational calculus and domain relational calculus are non-procedural. These query languages are formal,
lacking the "syntactic sugar" of commercial languages, but they illustrate the fundamental techniques for
extracting data from the database.
An entity set is mapped to a relation in a straightforward way: each attribute of the entity set
becomes a column of the table.
Consider the Employees entity set with attributes ssn, name, and lot shown in the figure:
[Figure: Employees entity set with attributes ssn, name, lot]
The following SQL statement captures the preceding information, including domain constraints and key
information:
CREATE TABLE Employees (ssn INTEGER, name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
A relationship set, like an entity set, is mapped to a relation in the relational model. To represent a
relationship without key and participation constraints, we must be able to identify each participating
entity and give values to the descriptive attributes of the relationship. Thus, the attributes of the relation
include:
i. The primary key attributes of each participating entity set, as foreign key fields.
ii. The descriptive attributes of the relationship set.
All the available information about the Works_In table is captured by the following SQL statement:
CREATE TABLE Works_In (ssn INTEGER, did INTEGER, address VARCHAR2(20), since DATE,
PRIMARY KEY (ssn, did, address), FOREIGN KEY (ssn) REFERENCES Employees, FOREIGN
KEY (address) REFERENCES Locations, FOREIGN KEY (did) REFERENCES Departments);
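[Figure: ER diagram of the Manages relationship set (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]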
The table corresponding to Manages has the attributes ssn, did, and since. However, because each department
has at most one manager, no two tuples can have the same did value but differ on the ssn value. A
consequence of this observation is that did is itself a key for Manages; indeed, the set {did, ssn} is not a key, because it is not minimal.
The Manages relation can be defined using the following SQL statement:
CREATE TABLE Manages (ssn INTEGER, did INTEGER, since DATE, PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees, FOREIGN KEY (did) REFERENCES Departments);
A second approach to translate a relationship set with key constraints is often superior because it avoids
creating a separate table for the relationship set.
In the Manages example, because a department has at most one manager, we can add the manager's ssn and the since attribute to
the Departments tuple. This approach eliminates the need for a separate Manages relation. The only
drawback to this approach is that space could be wasted if several departments have no managers.
The following SQL statement defines the Dept_Mgr relation, which captures the information in both
Departments and Manages and illustrates the second approach:
CREATE TABLE Dept_Mgr (did INTEGER, dname CHAR(20), budget REAL, ssn INTEGER, since
DATE, PRIMARY KEY(did), FOREIGN KEY (ssn) REFERENCES Employees);
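Consider the ER diagram shown below, which shows the two relationship sets Manages and Works_In:
[Figure: ER diagram in which Employees (ssn, name, lot) and Departments (did, dname, budget) are connected by two relationship sets, Manages (attribute since) and Works_In (attribute since)]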
Every department is required to have a manager, due to the participation constraint (total participation), and
at most one manager, due to the key constraint (one-to-many).
CREATE TABLE Dept_Mgr (did INTEGER, dname CHAR(20), budget REAL, ssn INTEGER NOT
NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees ON
DELETE NO ACTION);
The statement also captures the participation constraint that every department must have a manager: because ssn
cannot take on null values, each tuple of Dept_Mgr identifies a tuple in Employees. The NO ACTION
specification, which is the default and need not be explicitly specified, ensures that an Employees tuple
cannot be deleted while it is pointed to by a Dept_Mgr tuple.
A weak entity set always participates in a one-to-many binary relationship and has a key constraint and
total participation. The second translation approach is ideal in this case, but we must take into account
that the weak entity has only a partial key. Also, when an owner entity is deleted, we want all owned
weak entities to be deleted.
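Consider the Dependents weak entity set shown in the figure, with partial key pname.
[Figure: ER diagram of the weak entity set Dependents (pname, age), owned by Employees (ssn, name, lot) through an identifying relationship with attribute cost]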
A Dependents entity can be identified uniquely only if we take the key of the owning Employees entity
and the pname of the Dependents entity, and the Dependents entity must be deleted if the owning
Employees entity is deleted.
CREATE TABLE Dep_Policy (pname CHAR(20), age REAL, cost REAL, ssn INTEGER, PRIMARY
KEY (pname,ssn), FOREIGN KEY(ssn) REFERENCES Employees ON DELETE CASCADE);
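The effect of ON DELETE CASCADE on the weak entity table can be tried out with the sketch below, which runs the Dep_Policy definition against SQLite through Python's sqlite3 module. The sample employees and dependents are invented for illustration, and SQLite needs PRAGMA foreign_keys = ON before it enforces the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.execute("CREATE TABLE Employees (ssn INTEGER, name CHAR(20), lot INTEGER, PRIMARY KEY (ssn))")
conn.execute("""
    CREATE TABLE Dep_Policy (
        pname CHAR(20), age REAL, cost REAL, ssn INTEGER,
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE
    )
""")

# Hypothetical sample data: two employees; the partial key 'Joe' repeats across owners.
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", [(1, "Ann", 10), (2, "Bob", 20)])
conn.executemany(
    "INSERT INTO Dep_Policy VALUES (?, ?, ?, ?)",
    [("Joe", 5.0, 100.0, 1), ("Amy", 8.0, 120.0, 1), ("Joe", 3.0, 90.0, 2)],
)

# Deleting the owner also deletes the dependents it owns.
conn.execute("DELETE FROM Employees WHERE ssn = 1")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy ORDER BY ssn, pname").fetchall()
print(remaining)  # [('Joe', 2)]
```

The pair (pname, ssn) is the primary key, so two employees may each have a dependent named Joe, as the sample data shows.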
We present the two basic approaches to handling ISA hierarchies, using the ER diagram shown below:
[Figure: ISA hierarchy in which Employees (ssn, name, lot) has two subclasses, Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contract_id)]
1. We can map each of the entity sets Employees, Hourly_Emps, and Contract_Emps to a distinct
relation. The relation for Hourly_Emps includes the hourly_wages and hours_worked attributes of
Hourly_Emps. It also contains the key attributes of the super class (ssn, in this example), which serves as
the primary key for Hourly_Emps, as well as a foreign key referencing the super class (Employees). Note
that if a super class tuple is deleted, the delete must be cascaded to Hourly_Emps.
2. Alternatively, we can create just two relations, corresponding to Hourly_Emps and Contract_Emps.
The relation for Hourly_Emps includes all the attributes of Hourly_Emps as well as all the attributes of
Employees (i.e., ssn, name, lot, hourly_wages, hours_worked).
The first approach is general and always applicable. The second approach is not applicable if we have
employees who are neither hourly employees nor contract employees, since there is no way to store such
employees. A query that needs to examine all employees must now examine two relations.
[Figure: ER diagram with aggregation, in which Employees (ssn, name, lot) participates in the Monitors relationship set (attribute until) with the Sponsors relationship set]
The Employees, Projects, and Departments entity sets and the Sponsors relationship set are mapped as
described in the previous sections. For the Monitors relationship set, we create a relation with the following
attributes: the key attributes of Employees (ssn), the key attributes of Sponsors (did, pid), and the
descriptive attributes of Monitors (until).
Introduction to Views
A view is a table whose rows are not explicitly stored in the database but are computed as needed from a
view definition.
Creating a View
Consider the Students and Enrolled relations. Suppose we are interested in finding the names and student
ids of students who got a grade of B in some course, together with the course identifier. We can define the
view for this as follows:
CREATE VIEW B_Students (name, sid, course) AS SELECT S.sname, S.sid, E.cid FROM Students S,
Enrolled E WHERE S.sid = E.studid AND E.grade = 'B';
Removing a View
A view is removed from the database with the DROP VIEW statement:
DROP VIEW B_Students;
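To see that a view's rows are computed on demand rather than stored, the following sketch creates the B_Students view in SQLite through Python's sqlite3 module, changes the underlying Enrolled table, and then drops the view. The sample rows and course identifiers are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid INTEGER PRIMARY KEY, sname TEXT, login TEXT, age INTEGER, gpa REAL)"
)
conn.execute("CREATE TABLE Enrolled (studid INTEGER, cid TEXT, grade TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [(1, "Ann", "ann@cs", 18, 3.4), (2, "Bob", "bob@ee", 19, 3.2)],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [(1, "DB101", "B"), (2, "DB101", "A")],
)

conn.execute("""
    CREATE VIEW B_Students (name, sid, course) AS
    SELECT S.sname, S.sid, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.studid AND E.grade = 'B'
""")

query = "SELECT name, sid, course FROM B_Students ORDER BY sid, course"
rows_before = conn.execute(query).fetchall()

# A change to a base table is visible through the view on the next query.
conn.execute("INSERT INTO Enrolled VALUES (2, 'OS201', 'B')")
rows_after = conn.execute(query).fetchall()

# After DROP VIEW the name can no longer be queried.
conn.execute("DROP VIEW B_Students")
try:
    conn.execute(query)
    view_exists = True
except sqlite3.OperationalError:
    view_exists = False

print(rows_before)  # [('Ann', 1, 'DB101')]
print(rows_after)   # [('Ann', 1, 'DB101'), ('Bob', 2, 'OS201')]
print(view_exists)  # False
```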
Updatable Views
SQL allows updates to be specified only on views that are defined on a single base table using just
selection and projection, with no use of aggregate operators. Such views are called updatable views.
Advantages of Views
Data Independence
Improved Security
Reduced Complexity
Disadvantages of Views
Update restriction
Performance
Relational Algebra
Relational algebra is one of the two formal query languages associated with the relational model. Queries
in algebra are composed using a collection of operators. A fundamental property is that every operator in
the algebra accepts one or more relational instances as arguments and returns a relation instance as the
result. The relational algebra is procedural.
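[Figure: instances S1 and S2 of the Sailors relation and instance R1. Rows recoverable here: (44, Guppy, 5, 35.0) from a Sailors instance; (22, 101, 10/10/96) and (58, 103, 11/12/96) from R1]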
The various operators used to form relational algebra queries are:
1) Selection and Projection
In general, the selection operator σ specifies the tuples to retain through a selection condition. The
selection condition is a Boolean combination of terms that have the form attribute op constant or
attribute1 op attribute2, where op is one of the comparison operators <, <=, =, ≠, >=, or >.
Consider the instance S2 of the Sailors relation. We can find the sailors with rating above
8 by the following expression:
σ_{rating>8}(S2)
This evaluates to the relation shown below:
sid sname rating age
28 Yuppy 9 35.0
58 Rusty 10 35.0
The projection operator π allows us to extract columns from a relation. For example, we can find all
sailor names and ratings by the following expression:
π_{sname,rating}(S2)
This evaluates to the relation shown below:
sname rating
Yuppy 9
Lubber 8
Guppy 5
Rusty 10
Similarly, we can find the names and ratings of sailors with rating above 8 by the following expression:
π_{sname,rating}(σ_{rating>8}(S2))
This evaluates to the relation shown below:
sname rating
Yuppy 9
Rusty 10
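Selection and projection can be mimicked with Python sets of tuples, which also gives relations their set semantics (no duplicate tuples). This is only a sketch: the helper names select and project are invented here, the field names sid and age are assumed for the first and last columns, and the S2 rows are the ones that appear in the result tables of these notes.

```python
# Assumed schema for the Sailors instances.
FIELDS = ("sid", "sname", "rating", "age")

S2 = {
    (28, "Yuppy", 9, 35.0),
    (31, "Lubber", 8, 55.5),
    (44, "Guppy", 5, 35.0),
    (58, "Rusty", 10, 35.0),
}

def select(relation, predicate):
    """sigma: keep the tuples for which the condition holds; the schema is unchanged."""
    return {t for t in relation if predicate(dict(zip(FIELDS, t)))}

def project(relation, names):
    """pi: keep only the named columns; duplicates disappear because the result is a set."""
    positions = [FIELDS.index(n) for n in names]
    return {tuple(t[i] for i in positions) for t in relation}

high_rated = select(S2, lambda t: t["rating"] > 8)
names_and_ratings = project(high_rated, ["sname", "rating"])

print(sorted(high_rated))         # [(28, 'Yuppy', 9, 35.0), (58, 'Rusty', 10, 35.0)]
print(sorted(names_and_ratings))  # [('Rusty', 10), ('Yuppy', 9)]
```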
2) Set Operations
The following standard operations on sets are available in relational algebra: union (∪), intersection (∩),
set-difference (−), and cross-product (×).
Union:
R ∪ S returns a relation instance containing all tuples that occur in either relation instance R or relation
instance S (or both).
R and S must be union-compatible, and the schema of the result is identical to the schema of R.
Two relation instances are said to be union-compatible if the following conditions hold:
They have the same number of fields
Corresponding fields, taken in order from left to right, have the same domains
Ex: S1 ∪ S2 evaluates to the relation shown below:
sid sname rating age
22 Dustin 7 45.0
31 Lubber 8 55.5
58 Rusty 10 35.0
28 Yuppy 9 35.0
44 Guppy 5 35.0
Intersection:
R ∩ S returns a relation instance containing all tuples that occur in both R and S. The relations R and S
must be union-compatible, and the schema of the result is the same as the schema of R.
Ex: S1 ∩ S2 evaluates to:
sid sname rating age
31 Lubber 8 55.5
58 Rusty 10 35.0
Set-difference:
R − S returns a relation instance containing all tuples that occur in R but not in S. The relations R and S
must be union-compatible, and the schema of the result is identical to the schema of R.
Ex: S1 − S2 evaluates to:
sid sname rating age
22 Dustin 7 45.0
Cross-product:
R × S returns a relation instance whose schema contains all the fields of R followed by all the fields of S.
The result contains one tuple <r, s> (the concatenation of r and s) for each pair of tuples r in R and s in S.
The cross-product operation is sometimes called the Cartesian product.
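The four set operations map directly onto Python set operators and a set comprehension. The sketch below uses the Sailors instances S1 and S2 and the instance R1 as far as they can be read from the tables in these notes; treat the exact rows as an assumption.

```python
# Tuples are (sid, sname, rating, age) for S1 and S2.
S1 = {
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
}
S2 = {
    (28, "Yuppy", 9, 35.0),
    (31, "Lubber", 8, 55.5),
    (44, "Guppy", 5, 35.0),
    (58, "Rusty", 10, 35.0),
}
# R1 has three fields; the first is sid.
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

union = S1 | S2          # tuples in S1, in S2, or in both
intersection = S1 & S2   # tuples in both S1 and S2
difference = S1 - S2     # tuples in S1 but not in S2

# Cross-product: every tuple of S1 concatenated with every tuple of R1.
cross = {s + r for s in S1 for r in R1}

print(len(union), sorted(intersection), sorted(difference))
# 5 [(31, 'Lubber', 8, 55.5), (58, 'Rusty', 10, 35.0)] [(22, 'Dustin', 7, 45.0)]
print(len(cross))  # 6
```

Union, intersection, and set-difference are meaningful here only because S1 and S2 are union-compatible; the cross-product has no such requirement.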
3) Renaming
The result of a relational algebra expression inherits field names in such a way that naming
conflicts can arise in some cases. For example, in S1 × R1 both S1 and R1 contribute a field named sid. Hence we have to rename the fields or rename
the relation. Relational algebra provides the renaming operator ρ for this purpose.
The expression ρ(R(F), E) takes a relational algebra expression E and returns an instance of a relation R.
R contains the same tuples as the result of E and has the same schema as E, but some fields are renamed.
F is the list of renamed fields, each in the form oldname → newname or position → newname.
For example, the expression ρ(C(1 → sid1, 5 → sid2), S1 × R1) returns a relation C whose first field is
named sid1 and whose fifth field is named sid2; all other fields keep their names.
4) Joins
The join operation is used to combine the information from two or more relations. Join can be defined as
a cross-product followed by selection and projection. There are several variants of join operation:
Conditional Join:
The most general version of the join operation accepts a join condition c and a pair of relation instances
as arguments and returns a relation instance. The operation is defined as follows:
R ⋈_c S = σ_c(R × S)
Thus ⋈ is defined to be a cross-product followed by a selection.
Equijoin:
A common special case of the join operation R ⋈ S is when the join condition consists solely of
equalities of the form R.name1 = S.name2, that is, equalities between two fields in R and S. The join
operation with such an equality condition is called an equijoin.
Note that each pair of fields equated in the equijoin condition appears only once in the resulting instance.
Natural Join:
A further special case of the join operation R ⋈ S is an equijoin in which equalities are specified on all
common fields of R and S. In this case, we can simply omit the join condition. By default, the equality
condition is applied to all common fields.
We call this a natural join and denote it simply as S1 ⋈ R1. The result of this expression is the same
as that of the equijoin on sid, since the only common field is sid.
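The definitions of conditional join and natural join can be written out as short Python functions over sets of tuples. This is an illustrative sketch: the function names theta_join and natural_join are invented, the field names bid and day for R1 are assumptions (only sid is named in these notes), and the join condition S1.sid < R1.sid is chosen purely as an example.

```python
S1_FIELDS = ("sid", "sname", "rating", "age")
R1_FIELDS = ("sid", "bid", "day")  # 'bid' and 'day' are assumed names

S1 = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def theta_join(r, s, condition):
    """Conditional join: a cross-product followed by a selection."""
    return {a + b for a in r for b in s if condition(a, b)}

def natural_join(r, r_fields, s, s_fields):
    """Equijoin on all common fields; each common field appears once in the result."""
    common = [f for f in r_fields if f in s_fields]
    s_rest = [i for i, f in enumerate(s_fields) if f not in common]
    out_fields = tuple(r_fields) + tuple(s_fields[i] for i in s_rest)
    rows = set()
    for a in r:
        for b in s:
            if all(a[r_fields.index(f)] == b[s_fields.index(f)] for f in common):
                rows.add(a + tuple(b[i] for i in s_rest))
    return out_fields, rows

cond_rows = theta_join(S1, R1, lambda a, b: a[0] < b[0])  # S1.sid < R1.sid
nat_fields, nat_rows = natural_join(S1, S1_FIELDS, R1, R1_FIELDS)

print(sorted(cond_rows))
# [(22, 'Dustin', 7, 45.0, 58, 103, '11/12/96'), (31, 'Lubber', 8, 55.5, 58, 103, '11/12/96')]
print(nat_fields)        # ('sid', 'sname', 'rating', 'age', 'bid', 'day')
print(sorted(nat_rows))
# [(22, 'Dustin', 7, 45.0, 101, '10/10/96'), (58, 'Rusty', 10, 35.0, 103, '11/12/96')]
```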
5) Division
We discuss division through an example. Consider two relation instances A and B in which A has two
fields x and y and B has just one field y, with the same domain as in A. We define the division operation
A/B as the set of all x values such that for every y value in B, there is a tuple <x, y> in A.
Division is illustrated in the figure below. Relation A lists the parts (Pno) supplied by
suppliers (Sno), and each relation Bi lists a set of parts (Pno). A/Bi computes the suppliers who supply all parts listed
in relation instance Bi.
A:
Sno Pno
S1 P1
S1 P2
S1 P3
S1 P4
S2 P1
S2 P2
S3 P2
S4 P2
S4 P4
B1 (Pno): P2
B2 (Pno): P2, P4
B3 (Pno): P1, P2, P4
A/B1 (Sno): S1, S2, S3, S4
A/B2 (Sno): S1, S4
A/B3 (Sno): S1
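The definition of division translates almost word for word into a set comprehension. The sketch below uses the supplier and part data from the division figure; the function name divide is invented for this illustration.

```python
# A lists (Sno, Pno) pairs: supplier Sno supplies part Pno.
A = {
    ("S1", "P1"), ("S1", "P2"), ("S1", "P3"), ("S1", "P4"),
    ("S2", "P1"), ("S2", "P2"),
    ("S3", "P2"),
    ("S4", "P2"), ("S4", "P4"),
}
B1 = {"P2"}
B2 = {"P2", "P4"}
B3 = {"P1", "P2", "P4"}

def divide(a, b):
    """A/B: all x such that for every y in B the tuple (x, y) is in A."""
    xs = {x for x, _ in a}
    return {x for x in xs if all((x, y) in a for y in b)}

print(sorted(divide(A, B1)))  # ['S1', 'S2', 'S3', 'S4']
print(sorted(divide(A, B2)))  # ['S1', 'S4']
print(sorted(divide(A, B3)))  # ['S1']
```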
Relational Calculus
Relational calculus is an alternative to relational algebra. In contrast to the algebra, which
is procedural, the calculus is non-procedural, or declarative.
A tuple variable is a variable that takes on tuples of a particular relation schema as values. A
tuple relational calculus (TRC) query has the form { T | p(T) }, where T is a tuple variable and p(T) denotes a formula that
describes T.
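An atomic formula is one of the following:
R ∈ Rel
R.a op S.b
R.a op constant, or constant op R.a
A formula is recursively defined to be one of the following, where p and q are themselves formulas and p(R) denotes a formula in which the variable R appears:
any atomic formula
¬p, p ∧ q, p ∨ q, or p ⇒ q
∃R(p(R))
∀R(p(R))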
Example Relations (used in the queries below): loan, borrower, depositor, account, branch, customer
1. Find the loan-number, branch-name, and amount for loans of over $1200
{t | t ∈ loan ∧ t[amount] > 1200}
2. Find the loan number for each loan of an amount greater than $1200
{t | ∃s ∈ loan (t[loan-number] = s[loan-number] ∧ s[amount] > 1200)}
3. Find the names of all customers having a loan, an account, or both at the bank
{t | ∃s ∈ borrower (t[customer-name] = s[customer-name]) ∨ ∃u ∈ depositor (t[customer-name] = u[customer-name])}
4. Find the names of all customers who have a loan and an account at the bank
{t | ∃s ∈ borrower (t[customer-name] = s[customer-name]) ∧ ∃u ∈ depositor (t[customer-name] = u[customer-name])}
5. Find the names of all customers having a loan at the Perryridge branch
{t | ∃s ∈ borrower (t[customer-name] = s[customer-name] ∧ ∃u ∈ loan (u[branch-name] = "Perryridge"
∧ u[loan-number] = s[loan-number]))}
6. Find the names of all customers who have a loan at the Perryridge branch, but no account at any branch of
the bank
{t | ∃s ∈ borrower (t[customer-name] = s[customer-name] ∧ ∃u ∈ loan (u[branch-name] = "Perryridge"
∧ u[loan-number] = s[loan-number])) ∧ ¬∃v ∈ depositor (v[customer-name] = t[customer-name])}
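Because calculus queries describe the result rather than a procedure, they read much like set comprehensions. The sketch below evaluates queries 5 and 6 in Python, with any() playing the role of ∃ and not any() the role of ¬∃. The loan, borrower, and depositor rows are invented for illustration; only the attribute names come from the queries above.

```python
# Hypothetical instances; each tuple is a dict keyed by attribute name.
loan = [
    {"loan-number": "L-15", "branch-name": "Perryridge", "amount": 1500},
    {"loan-number": "L-16", "branch-name": "Perryridge", "amount": 1300},
    {"loan-number": "L-17", "branch-name": "Downtown", "amount": 1000},
]
borrower = [
    {"customer-name": "Hayes", "loan-number": "L-15"},
    {"customer-name": "Adams", "loan-number": "L-16"},
    {"customer-name": "Jones", "loan-number": "L-17"},
]
depositor = [
    {"customer-name": "Hayes", "account-number": "A-102"},
    {"customer-name": "Jones", "account-number": "A-101"},
]

def has_perryridge_loan(s):
    """True if some loan tuple u is at Perryridge and matches borrower tuple s."""
    return any(
        u["branch-name"] == "Perryridge" and u["loan-number"] == s["loan-number"]
        for u in loan
    )

# Query 5: customers having a loan at the Perryridge branch.
query5 = {s["customer-name"] for s in borrower if has_perryridge_loan(s)}

# Query 6: as query 5, but with no account at any branch.
query6 = {
    s["customer-name"]
    for s in borrower
    if has_perryridge_loan(s)
    and not any(v["customer-name"] == s["customer-name"] for v in depositor)
}

print(sorted(query5))  # ['Adams', 'Hayes']
print(sorted(query6))  # ['Adams']
```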
Example Queries:
1. Find the loan-number, branch-name, and amount for loans of over $1200
{<l, b, a> | <l, b, a> ∈ loan ∧ a > 1200}
2. Find the names of all customers who have a loan of over $1200
{<c> | ∃ l, b, a (<c, l> ∈ borrower ∧ <l, b, a> ∈ loan ∧ a > 1200)}
3. Find the names of all customers who have a loan from the Perryridge branch and the loan amount:
{<c, a> | ∃ l (<c, l> ∈ borrower ∧ ∃ b (<l, b, a> ∈ loan ∧ b = "Perryridge"))}
or
{<c, a> | ∃ l (<c, l> ∈ borrower ∧ <l, "Perryridge", a> ∈ loan)}
4. Find the names of all customers having a loan, an account, or both at the Perryridge branch:
{<c> | ∃ l (<c, l> ∈ borrower ∧ ∃ b, a (<l, b, a> ∈ loan ∧ b = "Perryridge")) ∨ ∃ a (<c, a> ∈ depositor
∧ ∃ b, n (<a, b, n> ∈ account ∧ b = "Perryridge"))}
5. Find the names of all customers who have an account at all branches located in Brooklyn:
{<c> | ∃ s, n (<c, s, n> ∈ customer) ∧ ∀ x, y, z ((<x, y, z> ∈ branch ∧ y = "Brooklyn") ⇒ ∃ a, b (<a, x, b>
∈ account ∧ <c, a> ∈ depositor))}