Unit 2
An Entity-Relationship (ER) model represents the structure of a database with the help
of a diagram. ER modelling is a systematic way to design a database, because it
requires you to analyze all data requirements before you implement the database.
History of ER models
Peter Chen proposed ER diagrams in 1976 as a uniform convention that can be used as
a conceptual modeling tool. Many models had been presented and discussed before,
but none was considered suitable. His model was also inspired by the data structure
diagrams of Charles Bachman.
• Lines: They link attributes to entity types, and entity types to relationship
types
• Entities
• Weak Entity
• Attributes
• Key Attribute
• Composite Attribute
• Multivalued Attribute
• Derived Attribute
• Relationships
• One-to-One Relationships
• One-to-Many Relationships
• Many-to-One Relationships
• Many-to-Many Relationships
Entities
For example, when a student studies a course, both the student and the course are entities.
Weak Entity
An entity that relies on another entity for its identification is called a weak entity.
For example, school is a strong entity because it has a primary key attribute, the
school number. The classroom, by contrast, is a weak entity because it has no primary
key of its own; the room number acts only as a discriminator.
Attribute
Key Attribute
For example: For a student entity, the roll number can uniquely identify a student
from a set of students.
Composite Attribute
An oval represents the composite attribute, and that oval is further connected to the
ovals of its component attributes.
Multivalued Attribute
Some attributes can hold more than one value; such attributes are called multivalued
attributes.
Derived Attribute
An attribute that can be derived from other attributes of the entity is known as a
derived attribute.
Relationship
One-to-One Relationship
For example, a student has only one identification card and an identification card is
given to one person.
One-to-Many Relationship
When a single element of an entity is associated with more than one element of
another entity, it is called a one-to-many relationship.
For example, a customer can place many orders, but an order cannot be placed by
many customers.
Many-to-One Relationship
When more than one element of an entity is related to a single element of another
entity, then it is called a many-to-one relationship.
For example, each student opts for a single course, but a course can have many
students.
Many-to-Many Relationship
When more than one element of an entity is associated with more than one element
of another entity, this is called a many-to-many relationship.
For example, you can assign an employee to many projects and a project can have
many employees.
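The four relationship types above can be told apart by counting partners on each side. The sketch below is a minimal illustration in Python, assuming a relationship is given as a set of (left, right) pairs; the function name `cardinality` and the sample identifiers are hypothetical. It classifies the observed pairs only, whereas a real ER design states the cardinality as a rule.

```python
from collections import Counter

def cardinality(pairs):
    """Classify a relationship given as (left, right) pairs (illustrative helper)."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)     # partners per left element
    right_counts = Counter(right for _, right in pairs)  # partners per right element
    left_many = any(n > 1 for n in left_counts.values())
    right_many = any(n > 1 for n in right_counts.values())
    if left_many and right_many:
        return "many-to-many"
    if left_many:
        return "one-to-many"
    if right_many:
        return "many-to-one"
    return "one-to-one"

# Sample data modelled on the examples in the text (identifiers are made up).
student_card = [("s1", "card1"), ("s2", "card2")]
customer_order = [("c1", "o1"), ("c1", "o2"), ("c2", "o3")]
student_course = [("s1", "dbms"), ("s2", "dbms")]
employee_project = [("e1", "p1"), ("e1", "p2"), ("e2", "p1")]

for name, rel in [("student-card", student_card), ("customer-order", customer_order),
                  ("student-course", student_course), ("employee-project", employee_project)]:
    print(name, "->", cardinality(rel))
```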
How to Draw an ER Diagram?
• First, identify all the Entities. Embed all the entities in a rectangle and label
them properly.
• Make sure your ER Diagram supports all the data provided to design the
database.
StudentID FirstName LastName Age
1 X BOSE 20
2 Y GHOSH 22
3 Z MITRA 21
2. Row (Tuple):
• A row represents a single record or data point in a table.
• Each row contains values for each column defined in the table.
• Example: The first row in the "Students" table represents a student with
ID 1, named X BOSE, aged 20.
3. Column (Attribute):
• A column represents a specific attribute or field in a table.
• Each column has a name and a data type.
• Example: In the "Students" table, "StudentID," "FirstName,"
"LastName," and "Age" are columns.
4. Primary Key:
• A primary key is a unique identifier for each record in a table.
• It ensures that each row in the table can be uniquely identified.
• Example: In the "Students" table, the "StudentID" column can be a
primary key.
5. Foreign Key:
• A foreign key is a column or a set of columns in one table that refers to
the primary key of another table.
• It establishes a link between the two tables.
• Example: If there is another table called "Courses" with a "StudentID"
column, it could be a foreign key linking to the "Students" table.
6. Relationships:
• Relationships define how tables are related to each other.
• Common types of relationships include one-to-one, one-to-many, and
many-to-many.
• Example: In a university database, the "Students" table and the
"Courses" table may have a one-to-many relationship, where one
student can be enrolled in multiple courses.
7. Normalization:
• Normalization is the process of organizing data in a database to reduce
redundancy and dependency.
• It involves breaking down large tables into smaller, related tables.
• Example: If the "Students" table has both "HomeAddress" and
"SchoolAddress," normalization may involve creating a separate
"Addresses" table and linking it with foreign keys.
What is Relational Algebra in DBMS?
Basic/Fundamental Operations:
1. Select (σ)
2. Project (∏)
3. Union (∪)
4. Set Difference (-)
5. Cartesian product (X)
6. Rename (ρ)
Derived Operations:
Let's discuss these operations one by one with the help of examples.
Select Operator (σ)
The select operator is denoted by sigma (σ) and is used to find the tuples (rows) in
a relation (table) that satisfy a given condition.
If you know a little SQL, you can think of it as the WHERE clause in SQL, which
serves the same purpose.
σ Condition/Predicate (Relation/Table name)
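Select Operator (σ) Example
Table: CUSTOMER
Customer_Id Customer_Name Customer_City
----------- ------------- -------------
C10100 Steve Agra
C10111 Raghu Agra
C10115 Chaitanya Noida
C10117 Ajeet Delhi
C10118 Carl Delhi
Query:
σ Customer_City="Agra" (CUSTOMER)
Output:
Customer_Id Customer_Name Customer_City
----------- ------------- -------------
C10100 Steve Agra
C10111 Raghu Agra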
Project Operator (∏)
The project operator is denoted by the ∏ symbol and is used to select the desired
columns (attributes) from a table (relation).
Table: CUSTOMER
Customer_Name Customer_City
------------- -------------
Steve Agra
Raghu Agra
Chaitanya Noida
Ajeet Delhi
Carl Delhi
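Select and project can be sketched in a few lines of Python, treating a relation as a list of dictionaries. The function names `select` and `project` are illustrative and not part of any library; the rows come from the CUSTOMER table used in these notes. Select keeps rows, project keeps columns and drops duplicate rows, since a relation is a set.

```python
CUSTOMER = [
    {"Customer_Id": "C10100", "Customer_Name": "Steve", "Customer_City": "Agra"},
    {"Customer_Id": "C10111", "Customer_Name": "Raghu", "Customer_City": "Agra"},
    {"Customer_Id": "C10115", "Customer_Name": "Chaitanya", "Customer_City": "Noida"},
    {"Customer_Id": "C10117", "Customer_Name": "Ajeet", "Customer_City": "Delhi"},
    {"Customer_Id": "C10118", "Customer_Name": "Carl", "Customer_City": "Delhi"},
]

def select(relation, predicate):
    """σ: keep only the rows for which the predicate is true."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """∏: keep only the named columns and remove duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# σ Customer_City="Agra" (CUSTOMER)
agra = select(CUSTOMER, lambda row: row["Customer_City"] == "Agra")
# ∏ Customer_Name, Customer_City (CUSTOMER)
names_and_cities = project(CUSTOMER, ["Customer_Name", "Customer_City"])
# ∏ Customer_City (CUSTOMER): five rows collapse to three distinct cities
cities = project(CUSTOMER, ["Customer_City"])
print(agra)
print(cities)
```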
Union Operator (∪)
The union operator is denoted by the ∪ symbol and is used to select all the rows
(tuples) from two tables (relations).
Suppose we have two relations R1 and R2 that have the same columns, and we want all
the tuples (rows) from both relations. We can then apply the union operator to
these relations.
Note: Rows (tuples) that are present in both tables appear only once in the union
set. In short, the result of a union operation contains no duplicates.
table_name1 ∪ table_name2
Union Operator (∪) Example
Table 1: COURSE
Student_Name
------------
Aditya
Carl
Paul
Lucy
Rick
Steve
Note: There are no duplicate names in the output, even though the two tables had a few
names in common and the COURSE table itself contained a duplicate name.
Intersection Operator (∩)
Suppose we have two relations R1 and R2 that have the same columns, and we want only
those tuples (rows) that are present in both relations. In that case we can apply the
intersection operation on the two relations: R1 ∩ R2.
Note: Only those rows that are present in both tables appear in the result set.
table_name1 ∩ table_name2
Intersection Operator (∩) Example
Student_Name
------------
Aditya
Steve
Paul
Lucy
Set Difference (-)
Set difference is denoted by the – symbol. Suppose we have two relations R1 and R2
and we want all those tuples (rows) that are present in relation R1 but not present
in relation R2. This can be done with the set difference R1 – R2.
table_name1 - table_name2
Set Difference (-) Example
Let's take the same tables COURSE and STUDENT that we have seen above.
Query:
Let's write a query to select those student names that are present in the STUDENT
table but not present in the COURSE table.
Student_Name
------------
Carl
Rick
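Union, intersection and set difference map directly onto Python's set operators. In the sketch below the two name sets are an assumption: they are reconstructed from the outputs shown in these notes, not copied from the original COURSE and STUDENT tables. Because Python sets hold each value once, duplicates disappear just as they do in the relational operators.

```python
# Student names projected from each table (reconstructed from the outputs above).
STUDENT = {"Aditya", "Steve", "Paul", "Lucy", "Carl", "Rick"}
COURSE = {"Aditya", "Steve", "Paul", "Lucy"}

union = STUDENT | COURSE         # STUDENT ∪ COURSE
intersection = STUDENT & COURSE  # STUDENT ∩ COURSE
difference = STUDENT - COURSE    # STUDENT - COURSE

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Set difference is not symmetric: with these sets, COURSE - STUDENT is empty while STUDENT - COURSE is not.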
Cartesian product (X)
The Cartesian product is denoted by the X symbol. Given two relations R1 and R2, the
Cartesian product R1 X R2 combines each tuple of the first relation R1 with each
tuple of the second relation R2, so relations with m and n tuples give a result with
m × n tuples. The example that follows makes this concrete.
R1 X R2
Cartesian product (X) Example
Table 1: R
Col_A Col_B
----- ------
AA 100
BB 200
CC 300
Table 2: S
Col_X Col_Y
----- -----
XX 99
YY 11
ZZ 101
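Query:
Let's find the Cartesian product of tables R and S.
R X S
Output:
Col_A Col_B Col_X Col_Y
----- ----- ----- -----
AA 100 XX 99
AA 100 YY 11
AA 100 ZZ 101
BB 200 XX 99
BB 200 YY 11
BB 200 ZZ 101
CC 300 XX 99
CC 300 YY 11
CC 300 ZZ 101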
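The pairing of every tuple of R with every tuple of S can be sketched with `itertools.product` from the Python standard library. The tuples below follow tables R and S above, in the column order Col_A, Col_B and Col_X, Col_Y; representing rows as plain tuples is a simplification chosen for this example.

```python
from itertools import product

R = [("AA", 100), ("BB", 200), ("CC", 300)]  # (Col_A, Col_B)
S = [("XX", 99), ("YY", 11), ("ZZ", 101)]    # (Col_X, Col_Y)

# Each result row is one row of R followed by one row of S.
r_x_s = [r + s for r, s in product(R, S)]

for row in r_x_s:
    print(row)
print(len(r_x_s), "rows")  # 3 rows x 3 rows
```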
Rename (ρ)
Suppose we have a table CUSTOMER. We fetch the customer names and rename the
resulting relation to CUST_NAMES.
Table: CUSTOMER
Customer_Id Customer_Name Customer_City
----------- ------------- -------------
C10100 Steve Agra
C10111 Raghu Agra
C10115 Chaitanya Noida
C10117 Ajeet Delhi
C10118 Carl Delhi
Query:
ρ(CUST_NAMES, ∏(Customer_Name)(CUSTOMER))
Output:
CUST_NAMES
----------
Steve
Raghu
Chaitanya
Ajeet
Carl
In relational algebra, extended operators are operators that can be derived from the
basic operators. We have already discussed the basic operators in the previous
section. Now let us discuss the extended operators and how they are useful in
relational algebra.
There are mainly three types of extended operators in relational algebra, namely:
1. Intersection
2. Divide
3. Join
1) Intersection
Example
A–
1 Aashi 98
3 Anjali 79
4 Brijesh 88
B–
1 Aashi 98
2 Abhishek 87
3 Anjali 79
4 Brijesh 88
Query - A ∩ B
1 Aashi 98
3 Anjali 79
4 Brijesh 88
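Intersection counts as a derived operator because it can be written with set difference alone: A ∩ B = A - (A - B). The short check below uses the rows of A and B from the example above; representing rows as Python tuples is a choice made for this sketch.

```python
A = {(1, "Aashi", 98), (3, "Anjali", 79), (4, "Brijesh", 88)}
B = {(1, "Aashi", 98), (2, "Abhishek", 87), (3, "Anjali", 79), (4, "Brijesh", 88)}

only_in_a = A - B        # rows of A that are missing from B
derived = A - only_in_a  # what remains of A must also be in B

print(sorted(derived))
print(derived == (A & B))  # compare with Python's built-in intersection
```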
2) Divide
Divide operator is used for the queries that contain the keyword ALL.
Example
Find all the students who have chosen both additional subjects, Machine Learning and
Data Mining.
Student –
Subject –
Student Name
Machine Learning
Data Mining
Student
Ashish
Yash
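Division can be sketched as: keep each student who is paired with every subject in the divisor. In the code below the function name `divide` and the contents of `CHOSEN` are assumptions for illustration; only the result, Ashish and Yash, comes from the example above, and the student Tarun is added as a hypothetical non-match.

```python
def divide(dividend, divisor):
    """Return each x that appears in the dividend together with every y of the divisor."""
    candidates = {x for x, _ in dividend}
    return {x for x in candidates if all((x, y) in dividend for y in divisor)}

# (student, subject) pairs; hypothetical data chosen to match the result above.
CHOSEN = {
    ("Ashish", "Machine Learning"), ("Ashish", "Data Mining"),
    ("Yash", "Machine Learning"), ("Yash", "Data Mining"),
    ("Tarun", "Machine Learning"),  # has chosen only one of the two subjects
}
SUBJECT = {"Machine Learning", "Data Mining"}

result = divide(CHOSEN, SUBJECT)
print(sorted(result))
```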
3) Join
As its name suggests, the join operation combines the information of two or more
relations. A join can also be defined as a cross product followed by selection and
projection. There are several varieties of the join operation. Let's discuss them
one by one.
Student1 –
1 Ashish 98
2 Shivam 72
3 Tarun 53
4 Yash 89
Student2 –
1 Anjali 99
4 Dinesh 79
5 Harsh 95
7 Kartik 88
Condition Join
When you join two relations based on a given condition, it is termed a Condition
Join. It is denoted by the symbol ⋈c.
For example, select the students from the Student1 table whose RollNo is greater
than the RollNo of the Student2 table.
Student1 ⋈c (Student1.RollNo > Student2.RollNo) Student2
Example
SELECT * FROM Student1, Student2 WHERE Student1.RollNo > Student2.RollNo;
Output –
2 Shivam 72 1 Anjali 99
3 Tarun 53 1 Anjali 99
4 Yash 89 1 Anjali 99
Equi Join
Equi Join is a special case of Condition Join. When you join two relations based on
an equality condition, it is termed an Equi Join. It is denoted by the symbol ⋈.
For example, select the students from the Student1 table whose RollNo is equal to the
RollNo of the Student2 table.
Student1 ⋈ (Student1.RollNo = Student2.RollNo) Student2
Example
SELECT * FROM Student1, Student2 WHERE Student1.RollNo = Student2.RollNo;
Output –
1 Ashish 98 1 Anjali 99
4 Yash 89 4 Dinesh 79
Natural Join
Natural Join is a join in which an equijoin is applied by default to all the attributes
that have the same name in both relations. You do not need to state the equality
condition; it is implied for every common attribute, and each common attribute appears
only once in the result. It is denoted by the symbol ⋈.
For example, select the students from the Student1 table whose RollNo is equal to the
RollNo of the Student2 table.
Student1 ⋈ Student2
Example
SELECT * FROM Student1 NATURAL JOIN Student2;
Output (assuming RollNo is the only attribute name the two relations share) –
1 Ashish 98 Anjali 99
4 Yash 89 Dinesh 79
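The statement that a join is a cross product followed by a selection can be sketched directly in Python. The helper name `theta_join` and the column order (RollNo, Name, Marks) are assumptions; Student2 includes the row (1, Anjali, 99) that the outputs above rely on.

```python
from itertools import product

Student1 = [(1, "Ashish", 98), (2, "Shivam", 72), (3, "Tarun", 53), (4, "Yash", 89)]
Student2 = [(1, "Anjali", 99), (4, "Dinesh", 79), (5, "Harsh", 95), (7, "Kartik", 88)]

def theta_join(left, right, condition):
    """Cross product of two relations, keeping only the pairs that satisfy the condition."""
    return [l + r for l, r in product(left, right) if condition(l, r)]

# Condition join: Student1.RollNo > Student2.RollNo (RollNo is at index 0)
condition_join = theta_join(Student1, Student2, lambda l, r: l[0] > r[0])
# Equi join: Student1.RollNo = Student2.RollNo
equi_join = theta_join(Student1, Student2, lambda l, r: l[0] == r[0])

for row in condition_join:
    print(row)
for row in equi_join:
    print(row)
```

The equi join is the same function with an equality test, which is why it is called a special case of the condition join.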
Tuple Relational Calculus (TRC)
Tuple Relational Calculus is a non-procedural query language with which we find the
tuples that hold true for a given condition. These conditions are called predicates.
TRC describes the desired information without giving a specific procedure for
obtaining that information, i.e., it specifies ‘what’ but not ‘how.’
It goes to each table row and checks whether the predicate condition is satisfied
(true) or not (false). It returns the tuple(s) which hold true for the predicate condition.
Syntax
{t | P(t)}
Here, t represents the tuples returned as results, and P(t) is the predicate logic
condition or expression.
*Note: TRC is used as a theoretical foundation for optimizing the queries. However,
it cannot be executed in SQL-based RDBMS such as MySQL, SQLite, or
PostgreSQL.
Example
Consider the following Student table
Student_Name Course_Enrolled Age
Student_1 Java 18
Student_5 C++ 16
Student_3 Web development 20
Student_4 C++ 17
Query 1: Create a TRC query to get the data of all the students whose age >= 17.
TRC Query
{t | t ∈ Student ∧ t.Age >= 17}
Explanation: The tuple variable t ranges over every tuple of the Student table. The
age in each row is checked, and only those tuples whose age is greater than or equal
to 17 are returned in the result.
The query can be interpreted as “Return all the tuples t which belong to table Student
and have an age greater than or equal to 17.”
Result
Student_Name Course_Enrolled Age
Student_1 Java 18
Student_3 Web development 20
Student_4 C++ 17
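Query 2: Create a TRC query to get the Student_Name of all the students enrolled in C++.
TRC Query
{t.Student_Name | t ∈ Student ∧ t.Course_Enrolled = 'C++'}
Explanation: The above query returns the Student_Name for all the tuples that have
Course_Enrolled as C++. The result is returned in the tuple t.
Result
Student_Name
Student_5
Student_4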
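TRC itself is not executable, but a Python comprehension has the same shape as {t | P(t)}: a tuple variable, a relation it ranges over, and a predicate. The sketch below is only an analogy under that assumption; the `Student` namedtuple is introduced here for illustration and uses the rows of the Student table above.

```python
from collections import namedtuple

Student = namedtuple("Student", ["Student_Name", "Course_Enrolled", "Age"])

STUDENT = [
    Student("Student_1", "Java", 18),
    Student("Student_5", "C++", 16),
    Student("Student_3", "Web development", 20),
    Student("Student_4", "C++", 17),
]

# {t | t ∈ Student ∧ t.Age >= 17}
query1 = [t for t in STUDENT if t.Age >= 17]

# Names of the students whose Course_Enrolled is C++
query2 = [t.Student_Name for t in STUDENT if t.Course_Enrolled == "C++"]

print(query1)
print(query2)
```

Python still evaluates the comprehension row by row, so this mirrors the notation rather than the non-procedural nature of TRC.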
Domain Relational Calculus (DRC)
The domain of the attributes is used in Domain Relational Calculus. It uses domain
variables to get the column values needed based on predicate conditions.
Syntax:
{<x1,x2,...,xn> | P(x1,x2,...,xn)}
Here, <x1,x2,...,xn> represents the domain variables used to get the column values,
and P(x1,x2,...,xn) is the predicate which is the condition for results.
*Note: DRC is a theoretical foundation for manipulating and describing data in a
relational database. However, it cannot be executed in SQL-based RDBMS such as
MySQL, SQLite, or PostgreSQL.
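The difference from TRC is that DRC variables stand for column values, not whole rows. As a rough analogy in Python, each row can be unpacked into one variable per column. The relation, its columns and the query {<name, age> | ∃ course (<name, course, age> ∈ Student ∧ age >= 17)} are all hypothetical and chosen only to show the shape of a DRC expression.

```python
# Hypothetical relation with columns (name, course, age).
STUDENT = [
    ("Student_1", "Java", 18),
    ("Student_5", "C++", 16),
    ("Student_3", "Web development", 20),
    ("Student_4", "C++", 17),
]

# name, course and age play the role of domain variables x1, x2, x3;
# the condition after `if` is the predicate P(x1, x2, x3).
result = [(name, age) for (name, course, age) in STUDENT if age >= 17]

print(result)
```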
Example
Consider the following Student table
Student _id
CN_2
CN_4
Use Cases of Relational Calculus in DBMS
Below are the use cases of relational calculus in DBMS.
1. Relational calculus helps in formulating complex queries that
involve multiple tables, aggregations and conditions. It helps users
retrieve the desired data from the database.