What Is Relational Algebra
Relational algebra consists of a set of operations that are widely used to
manipulate and query data in a relational database.
Fundamental Operators
1. UNION
The SQL UNION operator combines the results of two or more SELECT statements into one
result set. By default, UNION removes duplicate rows, so the result set contains
only distinct records.
Emp1 Table
CREATE TABLE Emp1 (
    Name VARCHAR(50),
    Country VARCHAR(50),
    Age int(2),
    mob int(10)
);
Output:
Emp1 Table
Emp2 Table
CREATE TABLE Emp2 (
    Name VARCHAR(50),
    Country VARCHAR(50),
    Age int(2),
    mob int(10)
);
Output:
Emp2 Table
In this example, we will find the countries (unique values only) from both the Emp1 and the
Emp2 tables:
Query:
SELECT Country FROM Emp1
UNION
SELECT Country FROM Emp2
ORDER BY Country;
Output:
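The duplicate-removing behaviour of UNION can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module with an in-memory database. The rows inserted into Emp1 and Emp2 are hypothetical sample data, not the contents of the original tables, and UNION ALL is shown only for contrast.

```python
import sqlite3

# In-memory database; all rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp1 (Name VARCHAR(50), Country VARCHAR(50), Age INT, mob INT);
    CREATE TABLE Emp2 (Name VARCHAR(50), Country VARCHAR(50), Age INT, mob INT);
    INSERT INTO Emp1 VALUES ('Asha', 'India', 30, 1111111111),
                            ('Ben', 'Canada', 41, 2222222222);
    INSERT INTO Emp2 VALUES ('Chloe', 'Canada', 28, 3333333333),
                            ('Dmitri', 'Russia', 35, 4444444444);
""")

# UNION keeps each country once, even though 'Canada' is in both tables.
union_rows = conn.execute(
    "SELECT Country FROM Emp1 UNION SELECT Country FROM Emp2 ORDER BY Country"
).fetchall()

# UNION ALL keeps duplicates, so 'Canada' appears twice.
union_all_rows = conn.execute(
    "SELECT Country FROM Emp1 UNION ALL SELECT Country FROM Emp2 ORDER BY Country"
).fetchall()

print(union_rows)
print(union_all_rows)
```

With this sample data, the UNION result has three rows and the UNION ALL result has four.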
2. INTERSECTION
The INTERSECT clause is used to retrieve the common records between two SELECT queries.
This makes INTERSECT an essential clause when we need to find overlapping data between
two or more queries.
Let’s consider two tables: the Customers table, which holds customer details, and
the Orders table, which contains information about customer purchases. By applying
the INTERSECT operator, we can retrieve customers who exist in both tables, meaning those
who have made purchases.
Customers Table
Orders Table
In this example, we retrieve customers who exist in both the Customers and Orders tables.
The INTERSECT operator ensures that only those customers who have placed an order
appear in the result.
Query:
SELECT CustomerID
FROM Customers
INTERSECT
SELECT CustomerID
FROM Orders;
Output:
CustomerID
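The INTERSECT query above can be tried out with the following sketch, again using sqlite3. The column layouts and rows of Customers and Orders are assumptions made for illustration, since the original tables are not reproduced here. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas and rows, for illustration only.
    CREATE TABLE Customers (CustomerID INTEGER, Name TEXT);
    CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Raj'), (3, 'Mei');
    INSERT INTO Orders VALUES (10, 1), (11, 3), (12, 3);
""")

# Customers that also appear in Orders; INTERSECT returns each ID once.
common_ids = conn.execute("""
    SELECT CustomerID FROM Customers
    INTERSECT
    SELECT CustomerID FROM Orders
    ORDER BY CustomerID
""").fetchall()

print(common_ids)
```

Customer 2 has no order in this sample, so it is left out. Customer 3 has two orders but appears only once.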
3. JOIN
EMPLOYEE
EMP_ID  EMP_NAME
101     Stephan
102     Jack
103     Harry
SALARY
EMP_ID  SALARY
101     50000
102     30000
103     25000
Result:
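One plausible reading of the join of the two tables above is a match on EMP_ID. This is an assumption, since the original result is not reproduced here. The sketch below loads the EMPLOYEE and SALARY rows shown above into sqlite3 and joins them on that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMP_ID INTEGER, EMP_NAME TEXT);
    CREATE TABLE SALARY (EMP_ID INTEGER, SALARY INTEGER);
    INSERT INTO EMPLOYEE VALUES (101, 'Stephan'), (102, 'Jack'), (103, 'Harry');
    INSERT INTO SALARY VALUES (101, 50000), (102, 30000), (103, 25000);
""")

# Join on the shared EMP_ID column (assumed join condition).
joined = conn.execute("""
    SELECT E.EMP_ID, E.EMP_NAME, S.SALARY
    FROM EMPLOYEE AS E
    JOIN SALARY AS S ON E.EMP_ID = S.EMP_ID
    ORDER BY E.EMP_ID
""").fetchall()

for row in joined:
    print(row)
```

Every EMP_ID appears in both tables, so each employee is paired with exactly one salary.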
Theta join combines tuples from different relations provided they satisfy the theta condition.
The join condition is denoted by the symbol θ.
Notation
R1 ⋈θ R2
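A theta join can be sketched in plain Python as a filter over the Cartesian product. The helper theta_join and the two relations below are hypothetical and exist only to illustrate the idea. The sketch assumes the two relations have no attribute names in common, so merging two rows never overwrites a value.

```python
from itertools import product

def theta_join(r1, r2, theta):
    """Return merged rows from r1 x r2 for which theta(row1, row2) is true."""
    return [{**a, **b} for a, b in product(r1, r2) if theta(a, b)]

# Hypothetical relations with disjoint attribute names.
products = [{"item": "pen", "price": 20}, {"item": "book", "price": 150}]
budgets = [{"buyer": "Ana", "limit": 100}, {"buyer": "Raj", "limit": 200}]

# Theta condition: price <= limit (any comparison operator could be used).
affordable = theta_join(products, budgets, lambda p, b: p["price"] <= b["limit"])

for row in affordable:
    print(row)
```

Of the four possible pairs, only the pair (book, Ana) fails the condition, so three rows remain.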
Natural join does not use any comparison operator. It does not concatenate the way a
Cartesian product does.
Natural join acts on the matching attributes of the two relations and keeps only the tuples
where the values of those attributes are the same in both relations.
3. Outer Joins
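Theta join and natural join are inner joins: they include only the tuples with matching
attributes and discard the rest. Outer joins are therefore needed to include all the tuples
from the participating relations in the resulting relation. There are three kinds of outer
joins: left outer join, right outer join, and full outer join.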
EMPLOYEE
FACT_WORKERS
Input:
EMPLOYEE ⋈ FACT_WORKERS
Output:
o Left outer join contains the set of tuples of all combinations in R and S that are
equal on their common attribute names, plus the tuples in R that have no matching
tuples in S (padded with NULL values).
o It is denoted by ⟕.
Input:
EMPLOYEE ⟕ FACT_WORKERS
o Right outer join contains the set of tuples of all combinations in R and S that are
equal on their common attribute names, plus the tuples in S that have no matching
tuples in R (padded with NULL values).
o It is denoted by ⟖.
Example: Using the EMPLOYEE and FACT_WORKERS relations above
Input:
EMPLOYEE ⟖ FACT_WORKERS
Output:
o Full outer join is like a left or right join except that it contains all rows from both
tables.
o In full outer join, tuples in R that have no matching tuples in S and tuples in S that
have no matching tuples in R on their common attribute names are also included
(padded with NULL values).
o It is denoted by ⟗.
Input:
EMPLOYEE ⟗ FACT_WORKERS
Output:
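The difference between the join variants above can be sketched in plain Python. The function natural_outer_join, its keep_left and keep_right flags, and the sample rows are all hypothetical: the original EMPLOYEE and FACT_WORKERS tables are not reproduced here. None stands in for NULL.

```python
def natural_outer_join(r, s, keep_left=False, keep_right=False):
    """Natural join of two lists of dicts, optionally keeping unmatched rows.

    Unmatched rows are padded with None, standing in for NULL.
    """
    r_attrs = list(r[0]) if r else []
    s_attrs = list(s[0]) if s else []
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = r_attrs + [a for a in s_attrs if a not in common]

    def pad(row):
        return {a: row.get(a) for a in all_attrs}

    result, matched_s = [], set()
    for x in r:
        hit = False
        for j, y in enumerate(s):
            if all(x[a] == y[a] for a in common):
                result.append({**x, **y})
                matched_s.add(j)
                hit = True
        if keep_left and not hit:
            result.append(pad(x))      # row of R with no partner in S
    if keep_right:
        # rows of S with no partner in R
        result += [pad(y) for j, y in enumerate(s) if j not in matched_s]
    return result

# Hypothetical sample rows.
employee = [{"EMP_NAME": "Ram", "CITY": "Mumbai"},
            {"EMP_NAME": "Hari", "CITY": "Delhi"}]
fact_workers = [{"EMP_NAME": "Ram", "SALARY": 10000},
                {"EMP_NAME": "Kuber", "SALARY": 30000}]

inner = natural_outer_join(employee, fact_workers)
left = natural_outer_join(employee, fact_workers, keep_left=True)
right = natural_outer_join(employee, fact_workers, keep_right=True)
full = natural_outer_join(employee, fact_workers, keep_left=True, keep_right=True)
```

With this data the inner join keeps only Ram. The left join adds Hari with a missing SALARY, the right join adds Kuber with a missing CITY, and the full join contains all three.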
Characteristics of DBMS
A DBMS (Database Management System) has several well-known characteristics.
These are explained below.
o Modeling reality is one of the most important and most easily understood
characteristics of a DBMS. A DBMS is designed so that it can manage the data of
large business organizations and store that data securely.
o The database can store information such as the cost of vegetables, milk, bread, etc.
In a DBMS, the entities resemble real-world entities.
Tuple Relational Calculus (TRC) uses tuple variables to represent rows (tuples) in a
relation and checks whether a tuple satisfies a given predicate.
Domain Relational Calculus (DRC) uses domain variables to represent attribute values and
checks whether a combination of attribute values satisfies a predicate.
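Many calculus expressions involve the use of quantifiers. There are two types of
quantifiers:
o Universal Quantifiers: The universal quantifier, denoted by ∀, is read as "for all" and
means that all tuples in a given set satisfy a given condition.
o Existential Quantifiers: The existential quantifier, denoted by ∃, is read as "there
exists" and means that at least one tuple in a given set satisfies a given condition.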
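The two calculus styles and the quantifiers can be mimicked loosely with Python comprehensions. This is an analogy, not a formal definition. The Emp relation and the salary thresholds below are assumptions chosen for illustration. Python's all() plays the role of ∀, and any() plays the role of the existential quantifier ∃ ("there exists").

```python
from collections import namedtuple

# Hypothetical relation; each tuple has the attributes emp_id, name, salary.
Emp = namedtuple("Emp", "emp_id name salary")
employees = [Emp(101, "Stephan", 50000), Emp(102, "Jack", 30000), Emp(103, "Harry", 25000)]

# TRC style: { t | t in EMPLOYEE and t.salary > 28000 }
# The variable t ranges over whole tuples.
trc_result = [t for t in employees if t.salary > 28000]

# DRC style: { <n> | there exist i, s with <i, n, s> in EMPLOYEE and s > 28000 }
# The variables i, n, s range over individual attribute values.
drc_result = [n for (i, n, s) in employees if s > 28000]

# Universal quantifier: every tuple satisfies the condition.
all_at_least_25000 = all(t.salary >= 25000 for t in employees)

# Existential quantifier: at least one tuple satisfies the condition.
some_above_45000 = any(t.salary > 45000 for t in employees)
```

The TRC-style query returns whole tuples, while the DRC-style query returns only the selected attribute values.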
A DBMS stores data on physical storage devices, such as main memory (RAM) and secondary
(external) storage (hard drives, SSDs).
Sizing Information: Details about the size of data (e.g., bytes per row, number
of rows).
Main Memory (RAM): Very fast but volatile storage (data is lost when power
is off).
Secondary Storage (Hard Drives, SSDs): More persistent storage, but slower
than RAM.
Sequential file organization is the simplest type of file organization: records
are stored in sequence, one after the other. Rather than arranging the
records in a row-and-column (tabular) format, it stores the records
in a single row.
The pile file method is the simplest sequential file organization method. Records are stored
on a first-come basis, meaning that whichever record arrives first is stored first in the
sequence. There is no fixed ordering: the order in which the records arrive decides the
order in which they are stored.
In the sorted file method, as the name suggests, records are stored in sorted order (ascending
or descending). The order can be defined by a primary key or any other key or attribute.
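The contrast between storing records in arrival order and storing them in sorted order can be sketched with two Python lists. The record keys are hypothetical, and a list only stands in for a file on disk.

```python
import bisect

arrival_order_file = []  # records kept in the order they arrive
sorted_file = []         # records kept in ascending key order

for key in [42, 7, 19, 3]:               # hypothetical keys, in arrival order
    arrival_order_file.append(key)       # new record goes at the end
    bisect.insort(sorted_file, key)      # new record goes at its sorted position

print(arrival_order_file)
print(sorted_file)
```

Appending is cheap but leaves the file unordered. Inserting in sorted position costs more per insert but allows binary search on the key afterwards.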
Bucket: A bucket is the unit of storage in a hash file where data records are
kept. Typically, a bucket corresponds to a disk block that holds several
records. Another name for it is the hash index.
Static Hashing
Dynamic Hashing
Static hashing in a Database Management System (DBMS) is a technique where the size
and structure of the hash table are fixed when it is created, so the number of buckets
stays the same as records are inserted or deleted.
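A minimal sketch of static hashing, under stated assumptions: the class StaticHashFile, the hash function key mod number-of-buckets, and the use of a per-bucket overflow list are illustrative choices, not the only way to build such a file. The number of buckets never changes after creation.

```python
class StaticHashFile:
    """Hash file with a fixed number of buckets (hypothetical sketch)."""

    def __init__(self, num_buckets=4, bucket_capacity=2):
        self.num_buckets = num_buckets          # fixed at creation time
        self.bucket_capacity = bucket_capacity
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]  # overflow chain per bucket

    def _address(self, key):
        # Assumed hash function: key modulo the fixed bucket count.
        return key % self.num_buckets

    def insert(self, key, record):
        i = self._address(key)
        if len(self.buckets[i]) < self.bucket_capacity:
            self.buckets[i].append((key, record))
        else:
            self.overflow[i].append((key, record))  # bucket overflow

    def search(self, key):
        i = self._address(key)
        for k, record in self.buckets[i] + self.overflow[i]:
            if k == key:
                return record
        return None

hash_file = StaticHashFile()
for key in [3, 7, 11]:                  # all three keys map to bucket 3
    hash_file.insert(key, f"rec{key}")
```

Keys 3 and 7 fill bucket 3, so key 11 lands in that bucket's overflow list. Searching for 11 still works but has to scan the overflow chain.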
Dynamic hashing is a technique used in DBMS that handles the limitations of static
hashing like bucket overflow. Here are the key aspects of dynamic hashing:
Directory: A directory is used to keep track of the buckets. The directory itself can
grow or shrink dynamically.
Hash Function: The hash function generates a hash value, and the directory uses a
certain number of bits from this hash value to determine the bucket address.
Bucket Splitting: When a bucket overflows, it is split into two, and the directory is
updated to reflect this change. This helps in distributing the records more evenly.
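The directory, hash function, and bucket splitting described above can be sketched as follows. This is one common variant, usually called extendible hashing, written under assumptions: the class names are hypothetical, Python's built-in hash() is the hash function, and the directory uses the lowest bits of the hash value. Shrinking the directory is not implemented.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth  # number of hash bits this bucket depends on
        self.items = {}

class ExtendibleHash:
    def __init__(self, capacity=2):
        self.capacity = capacity        # maximum records per bucket
        self.global_depth = 1           # number of hash bits the directory uses
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key):
        # Assumption: take the lowest global_depth bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def insert(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)         # overflow: split, then retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            self.directory += self.directory   # double the directory
            self.global_depth += 1
        bucket.local_depth += 1
        new_bucket = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        # Directory slots whose newly used bit is 1 now point to the new bucket.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = new_bucket
        # Redistribute the old bucket's records between the two buckets.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._index(k)].items[k] = v

table = ExtendibleHash(capacity=2)
for key in [0, 1, 2, 3, 4]:             # hypothetical integer keys
    table.insert(key, f"r{key}")
```

Inserting key 4 overflows the bucket holding 0 and 2. The directory doubles from two to four entries, that bucket is split, and all five records remain reachable.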
There are several processes and algorithms available to convert ER Diagrams into
Relational Schema. Some of them are automated and some of them are manual. Here we
focus on mapping the contents of an ER diagram to relational basics.
Mapping Entity
Entity's attributes should become fields of tables with their respective data types.
Declare primary key.
Mapping Relationship
Mapping Process
Add the primary keys of all participating Entities as fields of table with their
respective data types.
Declare a primary key composing all the primary keys of participating entities.
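The entity and relationship mapping steps above can be illustrated with a small schema created through sqlite3. The Student and Course entities and the Enrolls relationship are hypothetical examples. The REFERENCES clauses are an addition beyond the steps listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each entity becomes a table; its attributes become typed fields,
    -- and a primary key is declared.
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course (course_id INTEGER PRIMARY KEY, title TEXT);

    -- The relationship becomes a table holding the primary keys of all
    -- participating entities, which together form its primary key.
    CREATE TABLE Enrolls (
        student_id INTEGER REFERENCES Student(student_id),
        course_id INTEGER REFERENCES Course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
""")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk);
# pk is non-zero for columns that belong to the primary key.
pk_columns = [row[1] for row in conn.execute("PRAGMA table_info(Enrolls)") if row[5] > 0]
print(pk_columns)
```

The composite key means a given student can be enrolled in a given course only once.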
A weak entity set is one which does not have any primary key associated with it.
Mapping Process
Create table for weak entity set.
Mapping Process
Declare primary key of higher-level table and the primary key for lower-level table.
employee_roles Table
employee_id  job_code
E001         J01
E001         J02
E002         J02
E002         J03
E003         J01
employees Table
employee_id  name  state_code  home_state
jobs Table
job_code  job
J01       Chef
J02       Waiter
J03       Bartender
home_state is now dependent on state_code. So, if you know the state_code, then you
can find the home_state value.
employee_roles Table
employee_id  job_code
E001         J01
E001         J02
E002         J02
E002         J03
E003         J01
employees Table
employee_id  name   state_code
E001         Alice  26
E002         Bob    56
E003         Alice  56
jobs Table
job_code  job
J01       Chef
J02       Waiter
J03       Bartender
states Table
state_code  home_state
26          Michigan
56          Wyoming
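Splitting home_state into its own table loses no information, because it can be recovered with a join on state_code. The sketch below loads the employees and states rows shown above into sqlite3. The column types and key declarations are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (state_code INTEGER PRIMARY KEY, home_state TEXT);
    CREATE TABLE employees (
        employee_id TEXT PRIMARY KEY,
        name TEXT,
        state_code INTEGER REFERENCES states(state_code)
    );
    INSERT INTO states VALUES (26, 'Michigan'), (56, 'Wyoming');
    INSERT INTO employees VALUES ('E001', 'Alice', 26), ('E002', 'Bob', 56),
                                 ('E003', 'Alice', 56);
""")

# Look up each employee's home_state through state_code.
with_state = conn.execute("""
    SELECT e.employee_id, e.name, s.home_state
    FROM employees AS e
    JOIN states AS s ON e.state_code = s.state_code
    ORDER BY e.employee_id
""").fetchall()

for row in with_state:
    print(row)
```

Each state name is now stored once, so renaming a state means updating a single row in the states table.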
BCNF (Boyce-Codd Normal Form) is a stricter version of 3NF. A table is in BCNF if, for every
non-trivial functional dependency X -> Y, X is a super key of the table. In other words, the
table must be in 3NF, and the left-hand side of every FD must be a super key.
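The super key test in the BCNF definition above can be checked mechanically with an attribute closure. The functions closure and bcnf_violations are hypothetical helper names. The relation (student, subject, teacher) and its functional dependencies are an assumed textbook-style example, not data taken from this document.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)]

# Assumed example relation and dependencies.
attrs = {"student", "subject", "teacher"}
fds = [
    (frozenset({"student", "subject"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"subject"})),
]

violations = bcnf_violations(attrs, fds)
print(violations)
```

Here {student, subject} determines every attribute, so it is a super key. The attribute teacher determines only subject, so teacher -> subject violates BCNF.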
Example
jhansi K.Das C
subbu R.Prasad C