Basic SQL Relational Algebra Operations
The result of a retrieval is a new relation, which may have been formed from one or more relations.
The algebra operations thus produce new relations, which can be further manipulated using
operations of the same algebra.
A sequence of relational algebra operations forms a relational algebra expression, whose result
will also be a relation that represents the result of a database query (or retrieval request).
Basic Relational Algebra Operations
(Unary Relational Operations:)
SELECT Operation
The SELECT operation is used to select the subset of the tuples of a relation that satisfy a selection condition. It
acts as a filter: the tuples that satisfy the condition are selected, and the others are discarded.
Example: To select the EMPLOYEE tuples whose department number is four, or those
whose salary is greater than $30,000, the following notation is used:
σ DNO = 4 (EMPLOYEE)
σ SALARY > 30,000 (EMPLOYEE)
In general, the select operation is denoted by
σ <selection condition> (R)
where the symbol σ (sigma) is used to denote the select operator, and the selection condition is a Boolean
expression specified on the attributes of relation R.
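As a rough illustration of how σ maps onto SQL, the sketch below runs the two selections as WHERE clauses. It uses Python's built-in sqlite3 module as a stand-in database, and the three EMPLOYEE rows are invented for the example; only the attribute names DNO and SALARY come from the text.

```python
import sqlite3

# In-memory database; the rows are hypothetical and exist only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, DNO INTEGER, SALARY INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("John", "Smith", 5, 30000),
        ("Jennifer", "Wallace", 4, 43000),
        ("Alicia", "Zelaya", 4, 25000),
    ],
)

# sigma DNO = 4 (EMPLOYEE): keep only the tuples whose DNO is 4.
dno_4 = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE DNO = 4 ORDER BY FNAME"
).fetchall()

# sigma SALARY > 30,000 (EMPLOYEE): keep only the tuples with SALARY above 30000.
high_paid = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE SALARY > 30000 ORDER BY FNAME"
).fetchall()

print(dno_4)
print(high_paid)
```

Every column survives a selection; only the set of rows shrinks.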
PROJECT Operation
This operation selects certain columns from the table and discards the other columns.
PROJECT creates a vertical partitioning of the relation into two parts: one with the needed columns (attributes), which holds the result of
the operation, and one with the discarded columns.
Example: To list each employee’s first and last name and salary, the following is used:
π LNAME, FNAME, SALARY (EMPLOYEE)
The general form of the project operation is π <attribute list> (R), where π (pi) is the
symbol used to represent the project operation and <attribute list> is the desired list of attributes from
the attributes of relation R.
The project operation removes any duplicate tuples, so the result of the project operation is a set
of tuples and hence a valid relation.
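The duplicate-removal point can be seen in SQL, where a plain SELECT keeps duplicates and SELECT DISTINCT behaves like π. This is a small sketch using Python's sqlite3 module as a stand-in database, with invented EMPLOYEE rows.

```python
import sqlite3

# In-memory database; the rows are hypothetical and exist only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, DNO INTEGER, SALARY INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("John", "Smith", 5, 30000),
        ("Jennifer", "Wallace", 4, 43000),
        ("Alicia", "Zelaya", 4, 25000),
    ],
)

# pi LNAME, FNAME, SALARY (EMPLOYEE): keep three columns, drop DNO.
names = conn.execute(
    "SELECT DISTINCT LNAME, FNAME, SALARY FROM EMPLOYEE ORDER BY LNAME"
).fetchall()

# Projecting on DNO alone: a plain SELECT returns a multiset with duplicates ...
with_duplicates = conn.execute("SELECT DNO FROM EMPLOYEE ORDER BY DNO").fetchall()
# ... while DISTINCT gives the set that the algebra's PROJECT would produce.
as_set = conn.execute("SELECT DISTINCT DNO FROM EMPLOYEE ORDER BY DNO").fetchall()

print(names)
print(with_duplicates, as_set)
```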
Rename Operation
The RENAME operation is used to rename the output relation of an expression, its attributes, or both. It is often
convenient to break a complicated sequence of operations into steps and give each intermediate result
its own name.
Example-1: Query to rename the relation Student as MaleStudent and the attributes of
Student (RollNo, SName) as (Sno, Name).
ρ MaleStudent(Sno, Name) (Student)
Example-3: Query to rename the table name Project to Pro and its attributes to P, Q, R.
ρ Pro(P, Q, R) (Project)
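In SQL, the closest counterpart of ρ is the AS keyword, which aliases a table and its columns for the duration of a query. The sketch below applies it to the Student example using Python's sqlite3 module; the single row is invented, and the alias name MaleStudent is written as one word because SQL identifiers cannot contain an unquoted space.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (RollNo INTEGER, SName TEXT)")
conn.execute("INSERT INTO Student VALUES (1, 'Asha')")  # hypothetical row

# Rename Student to MaleStudent and its attributes to (Sno, Name) within the query.
cursor = conn.execute(
    "SELECT MaleStudent.RollNo AS Sno, MaleStudent.SName AS Name "
    "FROM Student AS MaleStudent"
)
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(column_names, rows)
```

The aliases only affect this query's result; the stored table keeps its original names.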
***************************************************************************
SQL set operations combine the results of two or more SQL SELECT statements.
1. Union
The UNION operation combines the rows of both SELECT statements and eliminates the duplicate rows from its result set.
Syntax:
SELECT column_name FROM table1
UNION
SELECT column_name FROM table2;
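2. Union All
The UNION ALL operation also combines the rows of both SELECT statements, but it keeps duplicate rows.
Syntax: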
SELECT column_name FROM table1
UNION ALL
SELECT column_name FROM table2;
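The difference between UNION and UNION ALL is easiest to see on two small tables that share one value. The sketch below uses Python's sqlite3 module as a stand-in database; the table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (column_name TEXT);
CREATE TABLE table2 (column_name TEXT);
INSERT INTO table1 VALUES ('a'), ('b');
INSERT INTO table2 VALUES ('b'), ('c');
""")

# UNION removes the duplicate 'b'.
union_rows = conn.execute("""
    SELECT column_name FROM table1
    UNION
    SELECT column_name FROM table2
    ORDER BY column_name
""").fetchall()

# UNION ALL keeps every row from both inputs, so 'b' appears twice.
union_all_rows = conn.execute("""
    SELECT column_name FROM table1
    UNION ALL
    SELECT column_name FROM table2
    ORDER BY column_name
""").fetchall()

print(union_rows)
print(union_all_rows)
```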
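3. Intersect
The INTERSECT operation returns only the rows that appear in the results of both SELECT statements.
Note: MySQL versions before 8.0.31 do not support the INTERSECT operator.
In those versions, we can use an INNER JOIN or an IN clause to emulate this operator.
Syntax: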
SELECT column_name FROM table1
INTERSECT
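SELECT column_name FROM table2;
Example:
SELECT * FROM first
INTERSECT
SELECT * FROM second;
The output below shows the columns of both tables side by side, so it comes from the INNER JOIN emulation
rather than from INTERSECT itself, which would list id and name only once: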
+----+---------+----+---------+
| id | name | id | name |
+----+---------+----+---------+
| 1 | ramu | 1 | ramu |
| 3 | jill | 3 | jill |
| 5 | sridevi | 5 | sridevi |
+----+---------+----+---------+
3 rows in set (0.00 sec)
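To compare INTERSECT with its two emulations, the sketch below runs all three on the same data with Python's sqlite3 module (SQLite supports INTERSECT directly). The rows of first and second are assumptions chosen to be consistent with the outputs shown in these notes, and the table names are quoted only to avoid any clash with SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "first"  (id INTEGER, name TEXT);
CREATE TABLE "second" (id INTEGER, name TEXT);
INSERT INTO "first"  VALUES (1, 'ramu'), (2, 'jack'), (3, 'jill'), (5, 'sridevi');
INSERT INTO "second" VALUES (1, 'ramu'), (3, 'jill'), (5, 'sridevi');
""")

# Native INTERSECT: rows present in both tables, each listed once.
intersect_rows = conn.execute("""
    SELECT id, name FROM "first"
    INTERSECT
    SELECT id, name FROM "second"
    ORDER BY id
""").fetchall()

# Emulation 1: INNER JOIN on every column, with DISTINCT to drop duplicates.
join_rows = conn.execute("""
    SELECT DISTINCT f.id, f.name
    FROM "first" AS f
    INNER JOIN "second" AS s ON f.id = s.id AND f.name = s.name
    ORDER BY f.id
""").fetchall()

# Emulation 2: IN clause. Note that this variant compares the id column only.
in_rows = conn.execute("""
    SELECT DISTINCT id, name FROM "first"
    WHERE id IN (SELECT id FROM "second")
    ORDER BY id
""").fetchall()

print(intersect_rows)
```

The IN variant is only equivalent when id alone identifies a row; the join variant compares every column, as INTERSECT does.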
--------------------------------------------------------------------
4. Minus
It combines the result of two SELECT statements. The MINUS operator displays the
rows which are present in the first query but absent in the second query.
The result has no duplicates; the order of its rows is not guaranteed unless an ORDER BY clause is added.
MINUS is Oracle's name for this operator; standard SQL and most other systems call it EXCEPT.
Let R and S be two relations.
Then-
R – S is the set of all tuples belonging to R and not to S.
In R – S, duplicates are automatically removed.
The difference operation is neither commutative nor associative: in general R – S ≠ S – R and (R – S) – T ≠ R – (S – T).
Syntax:
SELECT column_name FROM table1
MINUS
SELECT column_name FROM table2;
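Example
Using the same First and Second tables, MySQL versions without MINUS or EXCEPT can emulate the operation
with a LEFT JOIN. The output below is that of the LEFT JOIN of First with Second. The row whose Second
columns are NULL (id 2, jack) is the one row present in First but absent from Second, so it alone forms
the MINUS result.
+----+---------+------+---------+
| id | name | id | name |
+----+---------+------+---------+
| 1 | ramu | 1 | ramu |
| 2 | jack | NULL | NULL |
| 3 | jill | 3 | jill |
| 5 | sridevi | 5 | sridevi |
+----+---------+------+---------+
4 rows in set (0.00 sec)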
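The sketch below computes the difference two ways with Python's sqlite3 module: with EXCEPT (the standard spelling of MINUS, which SQLite supports) and with a LEFT JOIN filtered on NULL. The rows of first and second are assumptions chosen to match the output shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "first"  (id INTEGER, name TEXT);
CREATE TABLE "second" (id INTEGER, name TEXT);
INSERT INTO "first"  VALUES (1, 'ramu'), (2, 'jack'), (3, 'jill'), (5, 'sridevi');
INSERT INTO "second" VALUES (1, 'ramu'), (3, 'jill'), (5, 'sridevi');
""")

# first MINUS second, written with the standard EXCEPT keyword.
except_rows = conn.execute("""
    SELECT id, name FROM "first"
    EXCEPT
    SELECT id, name FROM "second"
""").fetchall()

# Emulation: LEFT JOIN, then keep only the rows that found no partner.
left_join_rows = conn.execute("""
    SELECT f.id, f.name
    FROM "first" AS f
    LEFT JOIN "second" AS s ON f.id = s.id AND f.name = s.name
    WHERE s.id IS NULL
""").fetchall()

# Swapping the operands changes the answer: difference is not commutative.
reversed_rows = conn.execute("""
    SELECT id, name FROM "second"
    EXCEPT
    SELECT id, name FROM "first"
""").fetchall()

print(except_rows, left_join_rows, reversed_rows)
```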
5. Division
Important: SQL has no division operator. However, division can be expressed using other operations (such as
CROSS JOIN, EXCEPT, and IN).
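One common way to express division is a double NOT EXISTS: keep a value if there is no divisor row that it fails to pair with. The sketch below shows this formulation, which is a substitute for the CROSS JOIN / EXCEPT / IN route named above, not the same query. It uses Python's sqlite3 module, and the tables enrolled and required are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrolled (student TEXT, course TEXT);  -- dividend R(student, course)
CREATE TABLE required (course TEXT);                -- divisor  S(course)
INSERT INTO enrolled VALUES ('ann', 'db'), ('ann', 'os'), ('bob', 'db');
INSERT INTO required VALUES ('db'), ('os');
""")

# enrolled / required: students enrolled in EVERY required course.
# Read as: there is no required course for which the student has no enrolment.
quotient = conn.execute("""
    SELECT DISTINCT e.student
    FROM enrolled AS e
    WHERE NOT EXISTS (
        SELECT 1 FROM required AS r
        WHERE NOT EXISTS (
            SELECT 1 FROM enrolled AS e2
            WHERE e2.student = e.student AND e2.course = r.course
        )
    )
    ORDER BY e.student
""").fetchall()

print(quotient)
```

Here bob is excluded because he is not enrolled in 'os', one of the two required courses.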
***************************************************************************
JOINS
Types of Join
There are mainly two types of joins in DBMS: inner joins and outer joins.
Inner Join
An inner join returns the rows from both tables that satisfy the given condition. It is the most widely used join
operation and can be considered the default join type.
An equijoin is an inner join whose join predicate uses only equality comparisons.
If the predicate uses another comparison operator such as “>”, the join cannot be called an equijoin.
Inner Join is further divided into three subtypes:
Theta join
Natural join
EQUI join
Theta Join
Theta Join allows you to merge two tables based on the condition represented by theta. Theta joins work with all
comparison operators. In relational database systems, joins are fundamental for retrieving data from multiple tables. The
Theta Join is a versatile join type that allows a broader range of conditions than the more commonly
used EQUI join, which relies strictly on equality conditions.
SELECT *
FROM TableA A
JOIN TableB B
ON A.column_name operator B.column_name;
Here, operator can be any comparison operator such as <, >, <=, >=, !=, or =. The result of a Theta Join will
include all combinations of rows from both tables where the specified condition holds true.
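As a concrete case of the template above, the sketch below joins on a greater-than condition instead of equality. It uses Python's sqlite3 module as a stand-in database. The Employees rows follow the example scenario in these notes; the Thresholds table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees  (EmployeeID INTEGER, Name TEXT, Salary INTEGER);
CREATE TABLE Thresholds (Label TEXT, MinSalary INTEGER);  -- hypothetical table
INSERT INTO Employees  VALUES (1, 'Alice', 70000), (2, 'Bob', 50000), (3, 'Charlie', 60000);
INSERT INTO Thresholds VALUES ('mid', 55000), ('senior', 65000);
""")

# Theta join with theta = '>': pair each employee with every threshold
# that the employee's salary exceeds.
theta_rows = conn.execute("""
    SELECT e.Name, t.Label
    FROM Employees AS e
    JOIN Thresholds AS t
      ON e.Salary > t.MinSalary
    ORDER BY e.Name, t.Label
""").fetchall()

print(theta_rows)
```

Alice matches both thresholds, Charlie only one, and Bob none, so a row can appear several times or not at all, depending on the condition.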
Example Scenario
Employees Table:
EmployeeID Name Salary
1 Alice 70000
2 Bob 50000
3 Charlie 60000
Departments Table:
Result Set
The result of the above query would yield:
Name DepartmentName
Alice HR
Charlie IT
Bob Sales
Performance Considerations
While Theta Joins are powerful, they can also lead to performance issues, especially with large datasets. The
database engine must evaluate the join condition for every combination of rows from the participating tables,
which can result in a Cartesian product if not properly constrained.
To optimize performance:
Indexes: Ensure that the columns involved in the join condition are indexed.
Filtering: Apply additional filtering conditions in the WHERE clause to reduce the number of rows
processed.
Join Order: Consider the order of joins, as it can significantly impact performance.
Conclusion
Theta Joins provide a flexible way to combine data from multiple tables based on various conditions. By
understanding how to implement and optimize Theta Joins, database professionals can enhance their querying
capabilities and improve the efficiency of data retrieval operations. As with any join operation, careful
consideration of the join conditions and performance implications is essential for effective database management.
Syntax:
A ⋈θ B
For example:
A ⋈ A.column 2 > B.column 2 (B)
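EQUI Join
An EQUI join is a theta join that uses only the equality condition.
In this example, the DepartmentID column serves as the common attribute between the Employees and Departments tables. An EQUI Join
will allow us to retrieve a list of employees along with their respective department names.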
In this query:
We are selecting the EmployeeID and EmployeeName from the Employees table (aliased as A) and
the DepartmentName from the Departments table (aliased as B).
The JOIN clause specifies that we want to combine records from both tables where
the DepartmentID matches.
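The query described above can be sketched as follows, using Python's sqlite3 module as a stand-in database. The table and column names follow the description; the DepartmentID values and the row contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Departments (DepartmentID INTEGER, DepartmentName TEXT);
CREATE TABLE Employees   (EmployeeID INTEGER, EmployeeName TEXT, DepartmentID INTEGER);
INSERT INTO Departments VALUES (10, 'HR'), (20, 'Sales'), (30, 'IT');
INSERT INTO Employees   VALUES (1, 'Alice', 10), (2, 'Bob', 20), (3, 'Charlie', 30);
""")

# EQUI join: the join condition uses only the equality operator.
equi_rows = conn.execute("""
    SELECT A.EmployeeID, A.EmployeeName, B.DepartmentName
    FROM Employees AS A
    JOIN Departments AS B
      ON A.DepartmentID = B.DepartmentID
    ORDER BY A.EmployeeID
""").fetchall()

print(equi_rows)
```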
A ⋈
For example:
column 1 column 2
1 1
Natural Join
Points to remember:
o There is no need to specify the column names to join.
o The resultant table always contains unique columns.
o It is possible to perform a natural join on more than two tables.
o Do not use the ON clause.
Syntax:
The following is a basic syntax to illustrate the natural join:
SELECT [column_names | *]
FROM table_name1
NATURAL JOIN table_name2;
In this syntax, we need to specify the column names to be included in the result set
after the SELECT keyword. If we want to select all columns from both tables,
the * operator will be used. Next, we will specify the table names for joining after the
FROM keyword and write the NATURAL JOIN clause between them.
Example:
C
Num Square
2 4
3 9
D
Num Cube
2 8
3 18
C ⋈ D
Num Square Cube
2 4 8
3 9 18
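The C ⋈ D example can be reproduced with NATURAL JOIN, which SQLite supports. The sketch below uses Python's sqlite3 module and copies the values of C and D from the tables above as they are given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE C (Num INTEGER, Square INTEGER);
CREATE TABLE D (Num INTEGER, Cube INTEGER);
INSERT INTO C VALUES (2, 4), (3, 9);
INSERT INTO D VALUES (2, 8), (3, 18);  -- values as given in the example
""")

# No ON clause: the join matches on the column both tables share (Num),
# and that column appears only once in the result.
cursor = conn.execute("SELECT * FROM C NATURAL JOIN D ORDER BY Num")
natural_columns = [description[0] for description in cursor.description]
natural_rows = cursor.fetchall()

print(natural_columns)
print(natural_rows)
```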
Outer Join
An Outer Join doesn’t require each record in the two joined tables to have a matching record. In this type of join, the
result retains each record of one or both tables even if no matching record exists in the other table.
Left Outer Join
Syntax
SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
Example:
A
Num Square
2 4
3 9
4 16
B
Num Cube
2 8
3 18
5 75
A ⟕ B
Num Square Cube
2 4 8
3 9 18
4 16 –
In our example, let’s assume that you need to get the names of members and movies rented by them. Now we
have a new member who has not rented any movie yet.
Right Outer Join
A ⟖ B
Num Cube Square
2 8 4
3 18 9
5 75 –
Full Outer Join
Syntax
SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;
Example:
A ⟗ B
Num Square Cube
2 4 8
3 9 18
4 16 –
5 – 75
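The three outer-join results for tables A and B can be checked with the sketch below, which uses Python's sqlite3 module. Because older SQLite releases lack RIGHT JOIN and FULL JOIN, the right and full outer joins are emulated here with LEFT JOIN (operands swapped, plus UNION ALL for the full join); that emulation is a choice made for this example, not the syntax shown above. NULL appears as None in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Num INTEGER, Square INTEGER);
CREATE TABLE B (Num INTEGER, Cube INTEGER);
INSERT INTO A VALUES (2, 4), (3, 9), (4, 16);
INSERT INTO B VALUES (2, 8), (3, 18), (5, 75);  -- values as given in the example
""")

# Left outer join: every row of A, with NULL where B has no match.
left_rows = conn.execute("""
    SELECT A.Num, A.Square, B.Cube
    FROM A LEFT JOIN B ON A.Num = B.Num
    ORDER BY A.Num
""").fetchall()

# Right outer join, emulated by swapping the operands of a LEFT JOIN.
right_rows = conn.execute("""
    SELECT B.Num, B.Cube, A.Square
    FROM B LEFT JOIN A ON A.Num = B.Num
    ORDER BY B.Num
""").fetchall()

# Full outer join, emulated as: left join, plus the B rows that matched nothing.
full_rows = conn.execute("""
    SELECT A.Num, A.Square, B.Cube
    FROM A LEFT JOIN B ON A.Num = B.Num
    UNION ALL
    SELECT B.Num, A.Square, B.Cube
    FROM B LEFT JOIN A ON A.Num = B.Num
    WHERE A.Num IS NULL
    ORDER BY 1
""").fetchall()

print(left_rows)
print(right_rows)
print(full_rows)
```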
Summary
There are mainly two types of joins in DBMS 1) Inner Join 2) Outer Join
An inner join is the most widely used join operation and can be considered the default join type.
Inner Join is further divided into three subtypes: 1) Theta join 2) Natural join 3) EQUI join
Theta Join allows you to merge two tables based on the condition represented by theta
When a theta join uses only the equality condition, it becomes an equi join.
Natural join does not name a comparison operator; it matches rows on the attributes that have the same name in both tables.
An outer join doesn’t require each record in the two join tables to have a matching record.
Outer Join is further divided into three subtypes: 1) Left Outer Join 2) Right Outer Join 3) Full Outer
Join
The LEFT Outer Join returns all the rows from the table on the left, even if no matching rows have been
found in the table on the right.
The RIGHT Outer Join returns all the rows from the table on the right, even if no matching rows have
been found in the table on the left.
In a full outer join, all tuples from both relations are included in the result, irrespective of the matching
condition.
*****************************************************************************
Aggregate Functions
An aggregate function in MySQL performs a calculation on multiple values
and returns a single value, such as the average of all values, the sum
of all values, or the maximum or minimum value within a group of values. We
mostly use aggregate functions with SELECT statements in the data query
language.
In database management, an aggregate function is a function in which the values of multiple rows
are grouped together as input, on certain criteria, to form a single value of more significant
meaning.
Syntax:
The following is the syntax for using aggregate functions in MySQL:
function_name([DISTINCT | ALL] expression)
In the above syntax, we use the following parameters:
o First, we need to specify the name of the aggregate function.
o Second, we use the DISTINCT modifier when we want to calculate the result
based on distinct values or ALL modifiers when we calculate all values,
including duplicates. The default is ALL.
o Third, we need to specify the expression that involves columns and arithmetic
operators.
There are various aggregate functions available in MySQL. The most
commonly used ones, COUNT(), SUM(), AVG(), MIN(), and MAX(), are described in the sections below.
Aggregate functions are used in databases, spreadsheets, and many other
data manipulation software packages. In a business context, different
levels of an organization need different information; top-level managers, for example, are interested
in overall figures rather than individual details. These functions produce
summarised data from our database. They are therefore used extensively in economics
and finance to represent economic health or stock and sector performance.
Count() Function
The MySQL COUNT() function returns the number of values in the expression. COUNT(*) counts
all rows, or only the rows that satisfy a specified condition, while COUNT(expression) counts only
the non-NULL values. Its return type is BIGINT. It returns zero if it does not find any matching
rows. It works with both numeric and non-numeric data types.
COUNT(*)
or
COUNT( [ALL|DISTINCT] expression )
SELECT COUNT(*)
FROM PRODUCT_MAST;
-------------------------------------------------------------------------
Sum() Function
The MySQL SUM() function returns the sum of the non-NULL values of an
expression. It returns NULL if the result set does not have any rows. It works with
numeric data types only.
SUM(expression)
or
SUM( [ALL|DISTINCT] expression )
Suppose we want to calculate the total number of working hours of all employees in
the table. We use the SUM() function as shown in the following query:
mysql> SELECT SUM(working_hours) AS "Total working hours" FROM employee;
----------------------------------------------------------------------------------------------------------
AVG() Function
The MySQL AVG() function calculates the average of the values in the specified
column. Like the SUM() function, it works with numeric data types only.
AVG(expression)
or
AVG( [ALL|DISTINCT] expression )
Suppose we want to get the average working hours of all employees in the table. We
use the AVG() function as shown in the following query:
mysql> SELECT AVG(working_hours) AS "Average working hours" FROM employee;
MIN() Function
The MySQL MIN() function returns the minimum (lowest) value of the specified
column. It works with numeric, string, and date values.
MIN(expression)
or
MIN( [ALL|DISTINCT] expression )
MAX() Function
The MySQL MAX() function returns the maximum (highest) value of the specified
column. Like MIN(), it works with numeric, string, and date values.
MAX(expression)
or
MAX( [ALL|DISTINCT] expression )
Suppose we want to get maximum working hours of an employee available in the
table, we need to use the MAX() function as shown in the following query:
mysql> SELECT MAX(working_hours) AS Maximum_working_hours FROM employee;
-------------------------------------------------------------------------
LAST() Function
LAST() returns the last value of the specified column, but it is supported only in MS Access.
In MySQL, to get the last value of a column we have to use the ORDER BY and LIMIT clauses instead.
Suppose we want to get the working hours of the employee who comes last when the table is sorted by name.
We need to use the following query:
mysql> SELECT working_hours FROM employee ORDER BY name DESC LIMIT 1;
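The aggregate functions described in this section can be tried together on a tiny table. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the employee rows are invented, and only the column names name and working_hours come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, working_hours INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 8), ("Bob", 10), ("Cid", 12)],  # hypothetical rows
)

# All five aggregates in one pass over the table.
count_all, total, average, lowest, highest = conn.execute("""
    SELECT COUNT(*), SUM(working_hours), AVG(working_hours),
           MIN(working_hours), MAX(working_hours)
    FROM employee
""").fetchone()

# MIN and MAX also work on text columns (alphabetical order here).
first_name, last_name = conn.execute(
    "SELECT MIN(name), MAX(name) FROM employee"
).fetchone()

# With no matching rows, COUNT returns 0 but SUM returns NULL (None in Python).
empty_count, empty_sum = conn.execute(
    "SELECT COUNT(*), SUM(working_hours) FROM employee WHERE working_hours > 100"
).fetchone()

# Substitute for LAST(): sort, then keep a single row.
last_hours = conn.execute(
    "SELECT working_hours FROM employee ORDER BY name DESC LIMIT 1"
).fetchone()[0]

print(count_all, total, average, lowest, highest)
print(first_name, last_name, empty_count, empty_sum, last_hours)
```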
***************************************************************************
MySQL Subquery
A subquery in MySQL is a query nested inside another SQL query. It can be
embedded in a SELECT, INSERT, UPDATE, or DELETE statement together with various
operators. A subquery can also be nested inside another subquery.
A subquery is known as the inner query, and the query that contains the subquery is
known as the outer query. The inner query is executed first and passes its result to the
outer query, which is then executed.
MySQL allows us to use a subquery in many places in a statement, but it must be enclosed in parentheses.
All subquery forms and operations supported by the SQL standard are also supported in
MySQL.
The following are the rules to use subqueries:
o Subqueries must always be enclosed in parentheses.
o If the main query does not compare multiple columns with the subquery, then the
subquery can have only one column in its SELECT list.
o We can use various comparison operators with the subquery, such as >, <, =,
IN, ANY, SOME, and ALL. A multiple-row operator is very useful when the
subquery returns more than one row.
o An ORDER BY clause inside a subquery does not order the final result (MySQL accepts it,
mainly together with LIMIT); to sort the output, use ORDER BY in the main query.
o A subquery cannot be immediately enclosed in a set function.
With the EXISTS operator, a query can find the name, occupation, and
age of each customer who has placed at least one order.
With the NOT EXISTS operator, it instead returns the details of the customers who have
not placed an order.
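The two queries described above might look like the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL. The customer and orders tables, their column names (apart from name, occupation, and age), and all rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER, name TEXT, occupation TEXT, age INTEGER);
CREATE TABLE orders   (order_id INTEGER, cust_id INTEGER);
INSERT INTO customer VALUES
    (1, 'Ann', 'Engineer', 30), (2, 'Bob', 'Teacher', 41), (3, 'Cid', 'Chef', 25);
INSERT INTO orders VALUES (100, 1), (101, 1), (102, 3);
""")

# EXISTS: customers for whom the inner query returns at least one row.
with_orders = conn.execute("""
    SELECT name, occupation, age
    FROM customer AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.cust_id = c.cust_id)
    ORDER BY name
""").fetchall()

# NOT EXISTS: customers for whom the inner query returns no rows.
without_orders = conn.execute("""
    SELECT name, occupation, age
    FROM customer AS c
    WHERE NOT EXISTS (SELECT 1 FROM orders AS o WHERE o.cust_id = c.cust_id)
    ORDER BY name
""").fetchall()

print(with_orders)
print(without_orders)
```

Ann appears once even though she has two orders, because EXISTS only tests whether any matching row exists.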
If we use ALL in place of ANY, the condition is TRUE only when the comparison is TRUE for all
values in the column returned by the subquery.
Overview
Nested query is one of the most useful functionalities of SQL. Nested queries are useful when we
want to write complex queries where one query uses the result from another query. Nested queries
will have multiple SELECT statements nested together. A subquery is a SELECT statement nested
within another SELECT statement.
The IN operator checks if a column value in the outer query's result is present in the inner query's result. The final result
will have rows that satisfy the IN condition.
The NOT IN operator checks if a column value in the outer query's result is not present in the inner query's result. The
final result will have rows that satisfy the NOT IN condition.
The ALL operator compares a value of the outer query's result with all the values of the inner query's result and returns
the row if the comparison holds for every one of those values.
The ANY operator compares a value of the outer query's result with the values of the inner query's result and returns the
row if the comparison holds for at least one of them.
The SELECT query inside the brackets () is the inner query, and the SELECT query outside the
brackets is the outer query. The outer query uses the result of the inner query.
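Employees
id name salary role
1 Augustine Hammond 10000 Developer
2 Perice Mundford 10000 Manager
3 Cassy Delafoy 30000 Developer
4 Garwood Saffen 40000 Manager
5 Faydra Beaves 50000 Developer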
Awards
id employee_id award_date
1 1 2022-04-01
2 3 2022-05-01
Example 1: IN
Select all employees who won an award.
SELECT id, name FROM Employees
WHERE
id IN (SELECT employee_id FROM Awards);
Output
id name
1 Augustine Hammond
3 Cassy Delafoy
Example 2: NOT IN
Select all employees who never won an award.
SELECT id, name FROM Employees
WHERE id NOT IN (SELECT employee_id FROM Awards);
Output
id name
2 Perice Mundford
4 Garwood Saffen
5 Faydra Beaves
Example 3: ALL
Select all Developers who earn more than all the Managers
SELECT * FROM Employees
WHERE role = 'Developer'
AND salary > ALL (
SELECT salary FROM Employees WHERE role = 'Manager'
);
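Output
id name salary role
5 Faydra Beaves 50000 Developer
Example 4: ANY
Select all Developers who earn more than any Manager
SELECT * FROM Employees
WHERE role = 'Developer'
AND salary > ANY (
SELECT salary FROM Employees WHERE role = 'Manager'
);
Output
id name salary role
3 Cassy Delafoy 30000 Developer
5 Faydra Beaves 50000 Developer
Explanation
The developers with id 3 and 5 earn more than at least one manager:
The developer with id 3 (30000) earns more than the manager with id 2 (10000)
The developer with id 5 (50000) earns more than the managers with id 2 (10000) and 4 (40000)
Example 5: Correlated subquery
Select all employees who earn more than the average salary for their role
Output
id name salary role
4 Garwood Saffen 40000 Manager
5 Faydra Beaves 50000 Developer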
Explanation
The manager with id 4 earns more than the average salary of all managers (25000), and the
developer with id 5 earns more than the average salary of all developers (30000). The inner query is
executed for all rows fetched by the outer query. The inner query uses the role value (emp1.role)
of every outer query's row (emp1.role = emp2.role).
We can find the average salary of managers and developers using the below query:
SELECT role, AVG(salary)
FROM Employees
GROUP BY role;
role avg(salary)
Developer 30000
Manager 25000
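The nested-query examples in this section can be run end to end with the sketch below, which uses Python's sqlite3 module. Two things are assumptions: the Employees rows are chosen to be consistent with the outputs and explanations given in this section, and, because SQLite has no ALL or ANY operators, `> ALL (...)` is rewritten as `> (SELECT MAX(...))` and `> ANY (...)` as `> (SELECT MIN(...))`, which is equivalent here since the subquery is non-empty and has no NULLs. The correlated query uses the aliases emp1 and emp2 mentioned in the explanation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (id INTEGER, name TEXT, salary INTEGER, role TEXT);
CREATE TABLE Awards (id INTEGER, employee_id INTEGER, award_date TEXT);
INSERT INTO Employees VALUES
    (1, 'Augustine Hammond', 10000, 'Developer'),
    (2, 'Perice Mundford',   10000, 'Manager'),
    (3, 'Cassy Delafoy',     30000, 'Developer'),
    (4, 'Garwood Saffen',    40000, 'Manager'),
    (5, 'Faydra Beaves',     50000, 'Developer');
INSERT INTO Awards VALUES (1, 1, '2022-04-01'), (2, 3, '2022-05-01');
""")


def ids(sql):
    """Run a query and return the sorted ids of the rows it selects."""
    return sorted(row[0] for row in conn.execute(sql))


# IN / NOT IN: employees with and without an award.
winners = ids("SELECT id FROM Employees WHERE id IN (SELECT employee_id FROM Awards)")
non_winners = ids("SELECT id FROM Employees WHERE id NOT IN (SELECT employee_id FROM Awards)")

# salary > ALL (managers)  rewritten as  salary > MAX(manager salaries).
above_all = ids("""
    SELECT id FROM Employees
    WHERE role = 'Developer'
      AND salary > (SELECT MAX(salary) FROM Employees WHERE role = 'Manager')
""")

# salary > ANY (managers)  rewritten as  salary > MIN(manager salaries).
above_any = ids("""
    SELECT id FROM Employees
    WHERE role = 'Developer'
      AND salary > (SELECT MIN(salary) FROM Employees WHERE role = 'Manager')
""")

# Correlated subquery: the inner query is re-evaluated for each outer row's role.
above_role_average = ids("""
    SELECT emp1.id FROM Employees AS emp1
    WHERE emp1.salary > (
        SELECT AVG(emp2.salary) FROM Employees AS emp2
        WHERE emp1.role = emp2.role
    )
""")

print(winners, non_winners, above_all, above_any, above_role_average)
```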
************************
Views
Views in DBMS (Database Management Systems) are virtual tables that are
derived from the result of a query. They provide a way to present data from
one or more tables in a customized manner without actually modifying the
underlying tables. Views are widely used in database systems to simplify
complex queries, enhance security, and improve performance.
Definition of Views
A view is a logical representation of data that is stored in one or more tables. It
is defined by a query that retrieves data from the underlying tables and
presents it as a virtual table. The view itself does not store any data; it is just a
saved query that can be executed to retrieve the desired data.
Syntax for Creating Views
To create a view in DBMS, you can use the CREATE VIEW statement followed
by the view name and the query that defines the view. The basic syntax for
creating a view is as follows:
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
In this syntax:
view_name is the name of the view you want to create.
column1, column2, ... are the columns you want to include in the view.
table_name is the name of the table(s) from which you want to retrieve
data.
condition is an optional condition that filters the data retrieved from the
table(s).
Example of Creating a View
Let's consider a scenario where we have two tables: employees and departments.
The employees table contains information about employees, such as their names,
salaries, and department IDs. The departments table contains information about
departments, such as their names and IDs.
We can create a view called employee_details that combines data from both
tables to provide a consolidated view of employee information. Here's an
example of how to create this view:
CREATE VIEW employee_details AS
SELECT e.name, e.salary, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;
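To see that the view stores no data of its own, the sketch below creates employee_details and queries it before and after a new employee is inserted. It uses Python's sqlite3 module as a stand-in database; the CREATE VIEW statement is the one above, while the table definitions and rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (department_id INTEGER, department_name TEXT);
CREATE TABLE employees (name TEXT, salary INTEGER, department_id INTEGER);
INSERT INTO departments VALUES (1, 'HR'), (2, 'IT');
INSERT INTO employees VALUES ('Alice', 70000, 1), ('Bob', 50000, 2);

CREATE VIEW employee_details AS
SELECT e.name, e.salary, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;
""")

# The view is queried like a table.
before = conn.execute("SELECT * FROM employee_details ORDER BY name").fetchall()

# Changing the underlying table changes what the view returns,
# because the view's query is re-run each time it is used.
conn.execute("INSERT INTO employees VALUES ('Charlie', 60000, 2)")
after = conn.execute("SELECT * FROM employee_details ORDER BY name").fetchall()

print(before)
print(after)
```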