DBMS Unit 2
UNIT – II
Relational Model: Codd’s rule, Logical database design, Structure of relational databases, Relational
Algebra, Fundamental relational algebra operations, Additional relational algebra operations, Extended
relational algebra operations, Null values, Relational calculus, Tuple relational calculus, Domain relational
calculus
COURSE OBJECTIVES:
To get familiar with fundamental concepts of database management such as database design,
database languages, and database-system implementation
COURSE OUTCOMES:
Develop the knowledge of fundamental concepts of database management systems.
Relational Model
CODD’S RULE:
Rule 1: The Information Rule
All information, whether it is user information or metadata, that is stored in a database must be entered as a
value in a cell of a table. It is said that everything within the database is organized in a table layout.
UNIT-2-Lecture Notes for BE CSE(DS) III SEM DBMS
The goal of logical database design is to create well-structured tables that reflect the needs of the user. The
tables of a logical database store data in a non-redundant manner, and foreign keys are used so that
relationships among tables and entities are supported.
With a logical database, the same data can be read from multiple programs.
A logical database defines the same user interface for multiple programs.
A logical database provides central authorization checks for access to sensitive data.
A logical database can improve performance. For example, using joins instead of multiple SELECT
statements reduces response time.
A logical database provides a particular view of the underlying database tables. It is most appropriate when
the structure of the database is large. It is convenient to use the following flow:
SELECT
READ
PROCESS
DISPLAY
Example:
Suppose that in a university or college, an HOD wants information about a specific student. The HOD first
retrieves the data for the student's batch and branch from the large body of data, and can then easily find the
required student's information without altering it.
In a Logical database, we can select meaningful data from a large amount of data.
A logical database provides central authorization, which checks whether each database access is authorized.
Less coding is required to retrieve data from the database than with other approaches.
Access performance when reading data from the hierarchical structure of the database is good.
User interfaces are easy to understand.
A logical database first runs check functions, which verify that user input is complete, correct, and
plausible.
A logical database takes more time when the required data is in a table at the lowest level of the hierarchy,
because all upper-level tables must be read first, which slows down performance.
A logical database has no ENDGET command, so the code block associated with an event ends at the next
event statement.
Data in relational structures is organized as a set of tables, called relations, consisting of columns and
rows. Each row of a table is a set of related values describing a single object or entity. Each row in a table
can be identified by a unique identifier called a primary key, and rows from multiple tables can be linked
using foreign keys.
The data model in relational databases is defined in advance and is strictly typed
Data is stored in tables consisting of columns and rows
Only one value is allowed at the intersection of each column and row
Each column is named and has a specific type, which the values in all rows of that column must follow
The columns are arranged in a certain order, which is determined when creating the table
A table may contain no rows at all, but it must have at least one column
Queries to the database return the result in the form of tables.
Table Structure
In relational databases, information is stored in tables linked to each other. The tables themselves consist of:
rows, which are called "records"
columns, which are called "fields" or "attributes"
In each table, each column has a predetermined data type. For example, these types can be:
VARCHAR (string data type)
INTEGER (numeric data type)
DATETIME (date and time data type)
and others
Relation: A relation is usually represented as a table, organized into rows and columns. A relation
consists of multiple records. For example: the Student relation, which contains tuples and attributes.
Tuple: The rows of a relation that contain the values corresponding to the attributes are called tuples. For
example: in the Student relation there are 5 tuples.
Data Item: The smallest unit of data in the relation is the individual data item. It is stored at the intersection
of a row and a column, also known as a cell. For example: 10112 and "Rama" are data items in the Student
relation.
Domain: A domain is the set of atomic values that an attribute can take. It can be specified explicitly by
listing all possible values, or by stating conditions that all values in the domain must satisfy. For
example: the domain of the Gender attribute is the set of values "M" for male and "F" for female. Most
database software does not fully support domains and typically allows users to define only simple data
types such as numbers, dates, and characters.
Attribute: An attribute is a named column of a relation. Each attribute Ai must have a domain, dom(Ai). For
example: Stu_No, S_Name, PHONE_NO, ADDRESS, and Gender are the attributes of the Student relation. In
relational databases, a column entry in any row is a single value that contains exactly one item.
Cardinality: The total number of rows in a relation at a given time is called the cardinality of that relation.
For example: if the Student relation contains 3 tuples, its cardinality is 3. The cardinality of a relation
changes with time as tuples are added or deleted.
Degree: The degree of a relation is the total number of attributes in it. A relation with one attribute is
called a unary relation, one with two attributes a binary relation, and one with three attributes a ternary
relation. For example: the Student relation has 5 attributes, so its degree is 5. The degree of a relation does
not change as tuples are added or deleted.
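As a small illustration of cardinality and degree, the sketch below stores a Student relation as a Python set of tuples. The attribute names and the values 10112 and "Rama" come from the notes; every other value is made up for the example.

```python
# Attribute names follow the notes; all rows except (10112, "Rama", ...) are hypothetical.
attributes = ("Stu_No", "S_Name", "PHONE_NO", "ADDRESS", "Gender")
student = {
    (10112, "Rama", "9000000001", "Hyderabad", "M"),
    (10113, "Sita", "9000000002", "Warangal", "F"),
    (10114, "Ravi", "9000000003", "Nizamabad", "M"),
}

degree = len(attributes)     # number of attributes (columns)
cardinality = len(student)   # number of tuples (rows)

# Adding a tuple changes the cardinality but not the degree.
student.add((10115, "Gita", "9000000004", "Karimnagar", "F"))
new_cardinality = len(student)
```

Using a set also mirrors the rule that a relation instance holds no duplicate tuples: adding the same tuple twice leaves the set unchanged.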
Relational instance: In the relational database system, the relational instance is represented by a finite set of
tuples. Relation instances do not have duplicate tuples.
Relational schema: A relational schema contains the name of the relation and name of all columns or
attributes.
Relational key: A relational key is a set of one or more attributes that uniquely identifies each row in the
relation.
A primary key is a column or set of columns of a table that uniquely identifies every record in that table.
There can be only one primary key in a table. The primary key cannot have the same value in two rows: each
value of the primary key has to be different, with no duplicates.
Candidate keys in DBMS are minimal sets of attributes that uniquely identify the rows of a table. The primary
key (PK) of a table is selected from the candidate keys, so candidate keys have the same properties as the
primary key explained above. There can be more than one candidate key in a table.
A super key is any set of one or more attributes that uniquely identifies the rows in a table. A super key may
contain extra attributes that are not needed for uniqueness.
Thus, every super key is a superset of some candidate key, and a candidate key is a super key with no
unnecessary attributes. The primary key (PK) of a table is picked from the candidate keys to be the table's
identifying attribute set.
A foreign key is used to establish a relationship between two tables. A foreign key requires each value in a
column or set of columns to match a primary key (PK) value of the referenced table. Thus, foreign keys help
to maintain data and referential integrity.
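The following sketch shows primary-key and foreign-key enforcement using Python's built-in sqlite3 module. The table and column names are chosen for illustration and are not part of the notes; the helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE quarterly ("
    " roll_no INTEGER PRIMARY KEY REFERENCES student(roll_no),"
    " maths INTEGER)"
)
conn.execute("INSERT INTO student VALUES (1, 'Sunny')")
conn.execute("INSERT INTO quarterly VALUES (1, 72)")


def try_insert(sql):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second row with roll_no 1 violates the primary key.
duplicate_pk_ok = try_insert("INSERT INTO student VALUES (1, 'Other')")
# roll_no 9 does not exist in student, so the foreign key rejects it.
dangling_fk_ok = try_insert("INSERT INTO quarterly VALUES (9, 50)")
```

Both inserts are expected to be rejected, which is how the database keeps the two tables consistent with each other.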
Composite Key in DBMS is a set of two or more attributes that facilitate or help identify each tuple in a table
uniquely. Furthermore, the attributes in the set may not be unique or distinctive when considered separately.
However, when taken all together, they will ensure/confirm uniqueness.
Alternate key: As explained above, a table can have multiple candidates for the primary key (PK), but only
one can be chosen. All the candidate keys that did not become the primary key are referred to as alternate
keys.
A unique key is a column or set of columns that uniquely identifies every record in a table, so all non-null
values in this key must be unique. A unique key differs from a primary key (PK) because it may contain null
values (some systems permit only one null, others permit several), whereas a primary key cannot contain any
null values.
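The relationship between super keys and candidate keys can be checked mechanically on a sample table. The sketch below is only an illustration: the rows and attribute names are hypothetical, and it tests uniqueness on one table instance, whereas a real key is a constraint that must hold for every valid instance.

```python
from itertools import combinations

# Hypothetical sample data for illustration.
attributes = ("Roll_No", "Name", "Phone")
rows = [
    (1, "Sunny", "111"),
    (2, "Rashni", "222"),
    (3, "Sunny", "333"),  # Name repeats, so Name alone cannot be a key
]


def is_super_key(attrs):
    """True if no two rows agree on all the attributes in attrs."""
    idx = [attributes.index(a) for a in attrs]
    seen = {tuple(row[i] for i in idx) for row in rows}
    return len(seen) == len(rows)


def candidate_keys():
    """Minimal super keys: those with no smaller key contained in them."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in found)
            if is_super_key(attrs) and not contains_smaller_key:
                found.append(attrs)
    return found
```

Here ("Roll_No", "Name") is a super key but not a candidate key, because Roll_No alone already identifies each row. If Roll_No is chosen as the primary key, Phone becomes an alternate key.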
RELATIONAL ALGEBRA:
The relational algebra is a procedural query language. It consists of a set of operations that take one or two
relations as input and produce a new relation as their result. The fundamental operations in the relational
algebra are select, project, union, set difference, Cartesian product, and rename. In addition to the
fundamental operations, there are several other operations—namely, set intersection, natural join, and
assignment. We shall define these operations in terms of the fundamental operations.
Fundamental Operations
The select, project, and rename operations are called unary operations, because they operate on one relation.
The other three operations operate on pairs of relations and are, therefore, called binary operations.
Consider the two relations STUDENT and QUARTERLY. The STUDENT relation describes the personal
information of each student: roll number, name, date of birth, and second language. The QUARTERLY
relation describes each student's marks in 3 subjects, identified by roll number.
STUDENT
Roll-No Name Date-of-Birth Second-Language
1 Sunny 01-07-70 Hindi
2 Rashni 15-08-72 Sanskrit
3 Anthra 29-01-71 Hindi
4 Nasreen 31-12-70 Telugu
QUARTERLY
Roll-No Maths Physics Computers
1 72 85 90
2 65 74 68
3 97 94 96
4 87 93 72
The following are queries based on relational algebra to obtain the required
information from the stored relational database.
1) The SELECT Operation: The select operation selects tuples that satisfy a given
predicate. The lowercase Greek letter sigma (σ) is used to denote the select operation.
The predicate appears as a subscript to σ.
The argument relation is given in parentheses. The general form of the selection operation is:
σ predicate (relation)
The comparisons =, ≠, <, >, <=, >= are allowed in the select operation predicate.
Furthermore, several predicates may be combined into a larger predicate using the
connectives and (∧) and or (∨).
Ex: 1) List out the complete information about all students whose 2nd language is Hindi.
σ Second-Language = "Hindi" (STUDENT)
Result of the above query is:
Roll-No Name Date-of-Birth Second-Language
1 Sunny 01-07-70 Hindi
3 Anthra 29-01-71 Hindi
Ex: 2) Display the roll numbers and marks of all students who secured more than 90 in
all three subjects.
σ (Maths > 90) ∧ (Physics > 90) ∧ (Computers > 90) (QUARTERLY)
Result of the above query is:
Roll-No Maths Physics Computers
3 97 94 96
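The two selection examples above can be sketched in Python by treating each relation as a list of dictionaries and the predicate as a function. The helper name select and this representation are choices made for the illustration only.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny", "Date-of-Birth": "01-07-70", "Second-Language": "Hindi"},
    {"Roll-No": 2, "Name": "Rashni", "Date-of-Birth": "15-08-72", "Second-Language": "Sanskrit"},
    {"Roll-No": 3, "Name": "Anthra", "Date-of-Birth": "29-01-71", "Second-Language": "Hindi"},
    {"Roll-No": 4, "Name": "Nasreen", "Date-of-Birth": "31-12-70", "Second-Language": "Telugu"},
]
QUARTERLY = [
    {"Roll-No": 1, "Maths": 72, "Physics": 85, "Computers": 90},
    {"Roll-No": 2, "Maths": 65, "Physics": 74, "Computers": 68},
    {"Roll-No": 3, "Maths": 97, "Physics": 94, "Computers": 96},
    {"Roll-No": 4, "Maths": 87, "Physics": 93, "Computers": 72},
]


def select(relation, predicate):
    """sigma: keep only the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]


# sigma Second-Language = "Hindi" (STUDENT)
hindi = select(STUDENT, lambda r: r["Second-Language"] == "Hindi")

# sigma (Maths > 90) and (Physics > 90) and (Computers > 90) (QUARTERLY)
toppers = select(
    QUARTERLY,
    lambda r: r["Maths"] > 90 and r["Physics"] > 90 and r["Computers"] > 90,
)
```

Selection never changes the set of attributes; it only filters rows, so every row in the result has the same columns as the input relation.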
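2) The PROJECT Operation: The project operation returns its argument relation with only
the chosen attributes. It is denoted by the uppercase Greek letter pi (Π). We list the
attributes that we wish to appear in the result as a subscript to Π. The argument
relation follows in parentheses.
The general form of the projection operation is:
Π list-of-attributes (relation)
The argument can itself be a relational algebra expression, such as a selection.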
Ex: 1) List out all Roll Nos. and their Computers marks.
Π Roll-No, Computers (QUARTERLY)
Result of the above Query is:
Roll-No Computers
1 90
2 68
3 96
4 72
2) Display the names and dates of birth of all students whose 2nd language is Hindi.
Π Name, Date-of-Birth (σ Second-Language = "Hindi" (STUDENT))
Result of the above query is:
Name Date-of-Birth
Sunny 01-07-70
Anthra 29-01-71
3) What is the date of birth of Rashni?
Π Date-of-Birth (σ Name = "Rashni" (STUDENT))
Result of the above query is:
Date-of-Birth
15-08-72
4) Find all Roll Nos. who secured more than 90 marks in Computers.
Π Roll-No (σ Computers > 90 (QUARTERLY))
Result of the above query is:
Roll-No
3
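Projection can be sketched the same way as selection. The helper names project and select are hypothetical; the sketch also removes duplicate rows, because the result of a relational algebra expression is a set of tuples.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny", "Date-of-Birth": "01-07-70", "Second-Language": "Hindi"},
    {"Roll-No": 2, "Name": "Rashni", "Date-of-Birth": "15-08-72", "Second-Language": "Sanskrit"},
    {"Roll-No": 3, "Name": "Anthra", "Date-of-Birth": "29-01-71", "Second-Language": "Hindi"},
    {"Roll-No": 4, "Name": "Nasreen", "Date-of-Birth": "31-12-70", "Second-Language": "Telugu"},
]


def select(relation, predicate):
    """sigma: keep only the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """pi: keep only attrs and drop duplicate rows, preserving first-seen order."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# pi Second-Language (STUDENT): "Hindi" appears twice in STUDENT but once here.
languages = project(STUDENT, ["Second-Language"])

# pi Name, Date-of-Birth (sigma Second-Language = "Hindi" (STUDENT))
hindi_births = project(
    select(STUDENT, lambda r: r["Second-Language"] == "Hindi"),
    ["Name", "Date-of-Birth"],
)
```

The second query shows how operations compose: the selection runs first and its result becomes the argument of the projection.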
3) The RENAME Operation: Unlike relations in the database, the results of
relational-algebra expressions do not have a name that we can use to refer to them. It is
useful to be able to give them names; the rename operator, denoted by the lowercase Greek
letter rho (ρ), lets us do this. Given a relational-algebra expression E, the following
expression returns the result of expression E under the name x.
ρ x (E)
Ex: ρ teacher (instructor)
A relation r by itself is considered a (trivial) relational-algebra expression. Thus, we can
also apply the rename operation to a relation r to get the same relation under a new
name.
A second form of the rename operation is as follows: Assume that a relational-algebra
expression E has arity n. Then, the following expression returns the result of
expression E under the name x, and with the attributes renamed to A1, A2, . . . , An.
ρ x(A1,A2,...,An) (E)
Ex: ρ teacher (id, name, sal) (instructor)
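A minimal sketch of the second form of rename, assuming each tuple is a Python dictionary whose keys are in attribute order. The instructor row is hypothetical, and in this sketch the new relation name is simply the Python variable the result is assigned to.

```python
def rename(relation, new_attrs):
    """rho x(A1,...,An)(E): same rows, attributes renamed position by position."""
    renamed = []
    for row in relation:
        if len(row) != len(new_attrs):
            raise ValueError("new attribute list must match the arity of the relation")
        renamed.append(dict(zip(new_attrs, row.values())))
    return renamed


# Hypothetical instructor relation with one row.
instructor = [{"ID": 101, "name": "Asha", "salary": 60000}]

# rho teacher(id, name, sal)(instructor)
teacher = rename(instructor, ["id", "name", "sal"])
```

The values are untouched; only the attribute names change, and the arity check reflects the requirement that E has exactly n attributes.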
To list out the Name and Computers marks, we have to refer to both relations, STUDENT and
QUARTERLY. Name is an attribute of the STUDENT relation and Computers is an
attribute of the QUARTERLY relation. Combining two relations so that every tuple of one is
paired with every tuple of the other is the Cartesian product, denoted by ×:
STUDENT × QUARTERLY
Roll-No Name Date-of-Birth Second-Language Roll-No Maths Physics Computers
1 Sunny 01-07-70 Hindi 1 72 85 90
1 Sunny 01-07-70 Hindi 2 65 74 68
1 Sunny 01-07-70 Hindi 3 97 94 96
1 Sunny 01-07-70 Hindi 4 87 93 72
2 Rashni 15-08-72 Sanskrit 1 72 85 90
2 Rashni 15-08-72 Sanskrit 2 65 74 68
2 Rashni 15-08-72 Sanskrit 3 97 94 96
2 Rashni 15-08-72 Sanskrit 4 87 93 72
3 Anthra 29-01-71 Hindi 1 72 85 90
3 Anthra 29-01-71 Hindi 2 65 74 68
3 Anthra 29-01-71 Hindi 3 97 94 96
3 Anthra 29-01-71 Hindi 4 87 93 72
4 Nasreen 31-12-70 Telugu 1 72 85 90
4 Nasreen 31-12-70 Telugu 2 65 74 68
4 Nasreen 31-12-70 Telugu 3 97 94 96
4 Nasreen 31-12-70 Telugu 4 87 93 72
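Selecting the rows in which both Roll-No values match and projecting Name and Computers gives:
Π Name, Computers (σ STUDENT.Roll-No = QUARTERLY.Roll-No (STUDENT × QUARTERLY))
Name Computers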
Sunny 90
Rashni 68
Anthra 96
Nasreen 72
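The Cartesian product followed by a selection and a projection can be sketched with itertools.product. To keep the example short, each relation is cut down to the two attributes the query needs, which is a simplification of the tables in the notes.

```python
from itertools import product

# (Roll-No, Name) and (Roll-No, Computers); other attributes omitted for brevity.
STUDENT = [(1, "Sunny"), (2, "Rashni"), (3, "Anthra"), (4, "Nasreen")]
QUARTERLY = [(1, 90), (2, 68), (3, 96), (4, 72)]

# STUDENT x QUARTERLY: every student tuple paired with every marks tuple.
pairs = list(product(STUDENT, QUARTERLY))

# sigma STUDENT.Roll-No = QUARTERLY.Roll-No: keep only the meaningful pairs.
matched = [(s, q) for s, q in pairs if s[0] == q[0]]

# pi Name, Computers
name_computers = [(s[1], q[1]) for s, q in matched]
```

With 4 tuples in each relation the product has 16 tuples, of which only 4 survive the selection on matching roll numbers.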
(ii) Find the student Roll No, Date of Birth, 2nd Language, Maths, Physics and Computers
marks.
σ STUDENT.Roll-No = QUARTERLY.Roll-No (STUDENT × QUARTERLY)
Result of the query is:
For a union operation r ∪ s to be valid, two conditions must hold:
The relations r and s must be of the same arity, i.e., they must have the same
number of attributes.
The domains of the ith attribute of r and the ith attribute of s must be the same, for all i.
Note that r and s can be either database relations or temporary relations that are the
result of relational algebra expressions.
As an example, consider the relations CULTURAL (name, class) and SPORTS (name,
class). These two relations represent information about all cultural competition winners and
sports winners separately.
CULTURAL SPORTS
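Ex: Find the names of all students of MPC III who won a cultural competition or a sports competition
or both.
Π Name (σ Class = ‘MPC III’ (CULTURAL)) ∪ Π Name (σ Class = ‘MPC III’ (SPORTS))
The set difference of the same two expressions gives the names of MPC III students who won a cultural
competition but not a sports competition:
Π Name (σ Class = ‘MPC III’ (CULTURAL)) − Π Name (σ Class = ‘MPC III’ (SPORTS))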
NAME NAME NAME
Kavya Hima Kavya
Zeba Sheela Zeba
Hima
E1 − E2
E1 × E2
Ex: List out all the names belonging to MPC III who won both the cultural and the sports
competition.
Π Name (σ Class = ‘MPC III’ (CULTURAL)) ∩ Π Name (σ Class = ‘MPC III’ (SPORTS))
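Python sets give a direct sketch of union, set difference, and intersection. The CULTURAL and SPORTS rows below are assumptions made for the illustration, since the original tables are not reproduced in these notes; the helper names_in_class is hypothetical.

```python
# Assumed contents of the two relations as (name, class) tuples.
CULTURAL = {("Kavya", "MPC III"), ("Zeba", "MPC III"), ("Hima", "MPC III"), ("Ravi", "BZC II")}
SPORTS = {("Hima", "MPC III"), ("Sheela", "MPC III"), ("Ravi", "BZC II")}


def names_in_class(relation, cls):
    """pi Name (sigma Class = cls (relation))"""
    return {name for name, c in relation if c == cls}


c = names_in_class(CULTURAL, "MPC III")
s = names_in_class(SPORTS, "MPC III")

union = c | s           # won cultural or sports or both
difference = c - s      # won cultural but not sports
intersection = c & s    # won both

# Intersection is not a fundamental operation: r ∩ s = r − (r − s).
via_difference = c - (c - s)
```

Both operands have a single attribute (Name) over the same domain, so they satisfy the arity and domain conditions required for these set operations.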
4 Nasreen 31-12-70 Telugu 4 87 93 72
The query can be written as:
Π Roll-No, Name, Date-of-Birth, Second-Language, Maths, Physics, Computers (STUDENT ⋈ QUARTERLY)
Now, to find student names and Computers marks, the query will be:
Π Name, Computers (STUDENT ⋈ QUARTERLY)
Name Computers
Sunny 90
Rashni 68
Anthra 96
Nasreen 72
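A natural join can be sketched as a nested loop that keeps pairs of tuples agreeing on every attribute name the two relations share. The function name natural_join and the dictionary representation are choices made for this illustration; the data is trimmed to the attributes the query needs.

```python
STUDENT = [
    {"Roll-No": 1, "Name": "Sunny"},
    {"Roll-No": 2, "Name": "Rashni"},
    {"Roll-No": 3, "Name": "Anthra"},
    {"Roll-No": 4, "Name": "Nasreen"},
]
QUARTERLY = [
    {"Roll-No": 1, "Computers": 90},
    {"Roll-No": 2, "Computers": 68},
    {"Roll-No": 3, "Computers": 96},
    {"Roll-No": 4, "Computers": 72},
]


def natural_join(r, s):
    """Pair tuples of r and s that agree on all shared attribute names."""
    common = [a for a in r[0] if a in s[0]] if r and s else []
    joined = []
    for x in r:
        for y in s:
            if all(x[a] == y[a] for a in common):
                joined.append({**x, **y})  # the shared attribute appears once
    return joined


joined = natural_join(STUDENT, QUARTERLY)
# pi Name, Computers (STUDENT ⋈ QUARTERLY)
name_computers = [(row["Name"], row["Computers"]) for row in joined]
```

Compared with the Cartesian product followed by a selection, the natural join keeps a single Roll-No column and needs no explicit matching condition.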
For example, the expression:
Π ID, name, deptname, salary÷12 (instructor)
gives the ID, name, deptname, and the monthly salary of each instructor.
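The generalized projection above can be sketched as a list comprehension that computes a new value for each tuple. The instructor rows and the output attribute name monthly_salary are hypothetical.

```python
# Hypothetical instructor relation.
instructor = [
    {"ID": 1, "name": "Asha", "deptname": "CSE", "salary": 60000},
    {"ID": 2, "name": "Ravi", "deptname": "ECE", "salary": 48000},
]

# pi ID, name, deptname, salary÷12 (instructor)
monthly = [
    {
        "ID": r["ID"],
        "name": r["name"],
        "deptname": r["deptname"],
        "monthly_salary": r["salary"] / 12,  # arithmetic expression in the projection list
    }
    for r in instructor
]
```

Unlike an ordinary projection, the last column does not exist in the input relation; it is computed from salary.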
2) AGGREGATION: The second extended relational-algebra operation is the aggregate
operation G, which permits the use of aggregate functions, such as min or average, on
sets of values.
Aggregate Functions: Aggregate functions take a collection of values and return a
single value as a result. The aggregate operation in relational algebra is expressed as:
G1, G2, …, Gn G F1(A1), F2(A2), …, Fn(An) (E)
where:
E is any relational algebra expression
Each Fi is an aggregate function
Each Ai is an attribute name
G1, G2, …, Gn is a list of attributes on which to group (can be empty)
The aggregate functions include:
1. sum: It is used to find the sum of values of an attribute in a relation.
Ex: Gsum(salary)(instructor)
2. avg: It is used to find the average value of an attribute in a relation.
Ex: Gavg(salary)(instructor)
3. count: It is used to find the number of values in an attribute in a relation.
Ex: Gcount(salary)(instructor)
There are cases where we must eliminate multiple occurrences of a value before
computing an aggregate function. If we do want to eliminate duplicates, we use the same
function names as before, with the addition of the keyword “distinct” appended to the end
of the function name (for example, count-distinct). The above example can then be written
as:
Gcount-distinct(salary)(instructor)
4. min: It is used to find the minimum value in an attribute in a relation.
Ex: Gmin(salary)(instructor)
5. max: It is used to find the maximum value in an attribute in a relation.
Ex: Gmax(salary)(instructor)
The result of aggregation does not have a name.
The rename operation can be used to give it a name.
For convenience, we permit renaming as part of the aggregate operation.
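The aggregate operation with an optional grouping list can be sketched as follows. The function name aggregate, the attribute dept_name, and the instructor rows are assumptions made for the illustration.

```python
from collections import defaultdict

# Hypothetical instructor relation.
instructor = [
    {"name": "A", "dept_name": "CSE", "salary": 60000},
    {"name": "B", "dept_name": "CSE", "salary": 80000},
    {"name": "C", "dept_name": "ECE", "salary": 60000},
]


def aggregate(relation, group_by, func, attr):
    """G1..Gn G F(A) (E): group on group_by, then apply func to attr in each group."""
    groups = defaultdict(list)
    for row in relation:
        key = tuple(row[g] for g in group_by)  # empty tuple when there is no grouping
        groups[key].append(row[attr])
    return {key: func(values) for key, values in groups.items()}


# G sum(salary) (instructor): empty grouping list, so one group for the whole relation
total = aggregate(instructor, [], sum, "salary")

# dept_name G avg(salary) (instructor)
avg_by_dept = aggregate(instructor, ["dept_name"], lambda v: sum(v) / len(v), "salary")

# G count-distinct(salary) (instructor): duplicates removed before counting
count_distinct = aggregate(instructor, [], lambda v: len(set(v)), "salary")
```

The returned dictionary keys play the role of the grouping attributes, and assigning the result to a variable corresponds to giving the unnamed result a name.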
NULL VALUES: It is possible for tuples to have a null value, denoted by null, for
some of their attributes. null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null.
Aggregate functions simply ignore null values.
For duplicate elimination and grouping, null is treated like any other value, and two nulls are
assumed to be the same. An alternative would be to assume that each null is different from every other;
the convention used here follows SQL.
Arithmetic operations involving null return null, and comparisons involving null return the special truth
value unknown.
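The behaviour of null can be sketched in Python by letting None stand in for both null and the truth value unknown. The helper names are hypothetical, and the and/or rules shown are the standard three-valued logic, which these notes do not spell out.

```python
UNKNOWN = None  # stands in for the truth value "unknown"


def null_add(a, b):
    """Arithmetic involving null yields null."""
    return None if a is None or b is None else a + b


def null_gt(a, b):
    """A comparison involving null yields unknown."""
    return UNKNOWN if a is None or b is None else a > b


def and3(p, q):
    """Three-valued AND: false dominates, then unknown."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return UNKNOWN
    return True


def or3(p, q):
    """Three-valued OR: true dominates, then unknown."""
    if p is True or q is True:
        return True
    if p is None or q is None:
        return UNKNOWN
    return False


salaries = [50000, None, 70000]

# A selection keeps a tuple only when the predicate is true, not unknown.
selected = [s for s in salaries if null_gt(s, 60000) is True]

# Aggregate functions ignore nulls.
total = sum(s for s in salaries if s is not None)
```

The tuple with a null salary is therefore excluded from the selection and skipped by the sum, rather than causing an error.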
Relational calculus:
Relational calculus is a non-procedural query language, and instead of algebra, it uses mathematical predicate
calculus. The relational calculus is not the same as that of differential and integral calculus in mathematics
but takes its name from a branch of symbolic logic termed as predicate calculus. When applied to databases,
it is found in two forms. These are
1. Tuple relational calculus which was originally proposed by Codd in the year 1972
2. Domain relational calculus which was proposed by Lacroix and Pirotte in the year 1977
In first-order logic or predicate calculus, a predicate is a truth-valued function with arguments. When we
substitute values for the arguments, the function yields an expression, called a proposition, which is
either true or false.
Tuple Relational Calculus (TRC) is a non-procedural query language used in relational database
management systems (RDBMS) to retrieve data from tables. TRC is based on the concept of tuples, which
are ordered sets of attribute values that represent a single row or record in a database table.
TRC is a declarative language, meaning that it specifies what data is required from the database, rather than
how to retrieve it. TRC queries are expressed as logical formulas that describe the desired tuples.
Syntax: The basic syntax of TRC is as follows:
{ t | P(t) }
where t is a tuple variable and P(t) is a logical formula that describes the conditions that the tuples in the
result must satisfy. The curly braces {} indicate that the expression is a set of tuples.
For example, let’s say we have a table called “Employees” with the following attributes:
Employee ID
Name
Salary
Department ID
To retrieve the names of all employees who earn more than $50,000 per year, we can use the following TRC
query:
{ t.Name | Employees(t) ∧ t.Salary > 50000 }
In this query, the “Employees(t)” expression specifies that the tuple variable t ranges over the rows of the
“Employees” table. The “∧” symbol is the logical AND operator, which combines the condition
“t.Salary > 50000” with the table membership test. Writing t.Name to the left of the bar keeps only the Name
attribute of each qualifying tuple.
The result of this query is a set of tuples, where each tuple contains the Name attribute of an employee
who earns more than $50,000 per year.
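The set-builder form of a TRC query maps closely onto a Python set comprehension, which also states what is wanted rather than how to loop. The employee rows below are hypothetical.

```python
from collections import namedtuple

Employee = namedtuple("Employee", ["EmployeeID", "Name", "Salary", "DepartmentID"])

# Hypothetical contents of the Employees table.
Employees = {
    Employee(1, "Asha", 65000, 10),
    Employee(2, "Ravi", 42000, 10),
    Employee(3, "Meena", 58000, 20),
}

# Whole tuples t such that Employees(t) and t.Salary > 50000
high_earners = {t for t in Employees if t.Salary > 50000}

# Only the Name attribute of each qualifying tuple
names = {t.Name for t in Employees if t.Salary > 50000}
```

The part after "for t in Employees" corresponds to Employees(t), the "if" clause to the condition on t.Salary, and the expression before "for" to what is placed left of the bar.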
TRC can also be used to perform more complex queries, such as joins and nested queries, by using additional
logical operators and expressions.
While TRC is a powerful query language, it can be more difficult to write and understand than practical
query languages such as Structured Query Language (SQL). However, it is useful in certain
applications, such as the formal verification of database schemas and academic research.
Tuple Relational Calculus is a non-procedural query language, unlike relational algebra. Tuple Calculus
provides only the description of the query but it does not provide the methods to solve it.
Domain Relational Calculus is a non-procedural query language equivalent in power to Tuple Relational
Calculus. Domain Relational Calculus provides only the description of the query but it does not provide the
methods to solve it. In Domain Relational Calculus, a query is expressed as,
{ < x1, x2, x3, ..., xn > | P (x1, x2, x3, ..., xn ) }
where < x1, x2, x3, …, xn > represents the resulting domain variables and P (x1, x2, x3, …, xn) represents the
condition or formula, as in predicate calculus.
Predicate Calculus Formula:
1. Set of all comparison operators
2. Set of connectives like and, or, not
3. Set of quantifiers
Example:
Table-1: Customer
Table-2: Loan
Loan number Branch name Amount
L01 Main 200
L03 Main 150
L10 Sub 90
L08 Main 60
Table-3: Borrower
Customer name Loan number
Ritu L01
Debomit L08
Soumya L03
Query-1: Find the loan number, branch, and amount of loans with an amount greater than or equal to 100.
{≺l, b, a≻ | ≺l, b, a≻ ∈ loan ∧ (a ≥ 100)}
Resulting relation:
Loan number Branch name Amount
L01 Main 200
L03 Main 150
Query-2: Find the loan number for each loan of an amount greater than or equal to 150.
{≺l≻ | ∃ b, a (≺l, b, a≻ ∈ loan ∧ (a ≥ 150))}
Resulting relation:
Loan number
L01
L03
Query-3: Find the names of all customers having a loan at the “Main” branch, along with the loan amount.
{≺c, a≻ | ∃ l (≺c, l≻ ∈ borrower ∧ ∃ b (≺l, b, a≻ ∈ loan ∧ (b = “Main”)))}
Resulting relation:
Customer Name Amount
Ritu 200
Debomit 60
Soumya 150
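The three domain relational calculus queries can be sketched as Python set comprehensions, with one variable per domain instead of one variable per tuple. The loan rows L01 and L03 are an assumption: they are inferred from the query results and the Borrower table rather than copied from a complete Loan table.

```python
# (loan number, branch name, amount); rows L01 and L03 are inferred from the results.
loan = {
    ("L01", "Main", 200),
    ("L03", "Main", 150),
    ("L10", "Sub", 90),
    ("L08", "Main", 60),
}
# (customer name, loan number)
borrower = {("Ritu", "L01"), ("Debomit", "L08"), ("Soumya", "L03")}

# Query-1: {<l, b, a> | <l, b, a> in loan and a >= 100}
q1 = {(l, b, a) for (l, b, a) in loan if a >= 100}

# Query-2: {<l> | exists b, a (<l, b, a> in loan and a >= 150)}
q2 = {l for (l, b, a) in loan if a >= 150}

# Query-3: {<c, a> | exists l (<c, l> in borrower and
#                    exists b (<l, b, a> in loan and b = "Main"))}
q3 = {
    (c, a)
    for (c, l) in borrower
    for (l2, b, a) in loan
    if l2 == l and b == "Main"
}
```

Variables that are existentially quantified in the formula (b and a in Query-2, l and b in Query-3) are the ones that appear in the loops but not in the output expression.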